207 Human Secreted Proteins
Field of the Invention 

This invention relates to newly identified polynucleotides and the polypeptides 
encoded by these polynucleotides, uses of such polynucleotides and polypeptides, and 
their production.

Background of the Invention 

Unlike bacteria, which exist as a single compartment surrounded by a
membrane, human cells and other eukaryotes are subdivided by membranes into many
functionally distinct compartments. Each membrane-bounded compartment, or
organelle, contains different proteins essential for the function of the organelle. The cell
uses "sorting signals," which are amino acid motifs located within the protein, to target
proteins to particular cellular organelles.

One type of sorting signal, called a signal sequence, a signal peptide, or a leader
sequence, directs a class of proteins to an organelle called the endoplasmic reticulum
(ER). The ER separates the membrane-bounded proteins from all other types of
proteins. Once localized to the ER, both groups of proteins can be further directed to
another organelle called the Golgi apparatus. Here, the Golgi distributes the proteins to
vesicles, including secretory vesicles, the cell membrane, lysosomes, and the other
organelles.

Proteins targeted to the ER by a signal sequence can be released into the
extracellular space as a secreted protein. For example, vesicles containing secreted
proteins can fuse with the cell membrane and release their contents into the extracellular
space, a process called exocytosis. Exocytosis can occur constitutively or after receipt
of a triggering signal. In the latter case, the proteins are stored in secretory vesicles (or
secretory granules) until exocytosis is triggered. Similarly, proteins residing on the cell
membrane can also be secreted into the extracellular space by proteolytic cleavage of a
"linker" holding the protein to the membrane.

Despite the great progress made in recent years, only a small number of genes
encoding human secreted proteins have been identified. These secreted proteins include
the commercially valuable human insulin, interferon, Factor VIII, human growth
hormone, tissue plasminogen activator, and erythropoietin. Thus, in light of the
pervasive role of secreted proteins in human physiology, a need exists for identifying
and characterizing novel human secreted proteins and the genes that encode them. This
knowledge will allow one to detect, to treat, and to prevent medical disorders by using
secreted proteins or the genes that encode them.

Summary of the Invention

The present invention relates to novel polynucleotides and the encoded 
polypeptides. Moreover, the present invention relates to vectors, host cells, antibodies, 
and recombinant methods for producing the polypeptides and polynucleotides. Also
provided are diagnostic methods for detecting disorders related to the polypeptides, and 
therapeutic methods for treating such disorders. The invention further relates to 
screening methods for identifying binding partners of the polypeptides. 

Detailed Description

Definitions 

The following definitions are provided to facilitate understanding of certain 
terms used throughout this specification. 

In the present invention, "isolated" refers to material removed from its original
environment (e.g., the natural environment if it is naturally occurring), and thus is
altered "by the hand of man" from its natural state. For example, an isolated
polynucleotide could be part of a vector or a composition of matter, or could be
contained within a cell, and still be "isolated" because that vector, composition of
matter, or particular cell is not the original environment of the polynucleotide.

In the present invention, a "secreted" protein refers to those proteins capable of
being directed to the ER, secretory vesicles, or the extracellular space as a result of a
signal sequence, as well as those proteins released into the extracellular space without
necessarily containing a signal sequence. If the secreted protein is released into the
extracellular space, the secreted protein can undergo extracellular processing to produce
a "mature" protein. Release into the extracellular space can occur by many
mechanisms, including exocytosis and proteolytic cleavage.

As used herein, a "polynucleotide" refers to a molecule having a nucleic acid
sequence contained in SEQ ID NO:X or the cDNA contained within the clone deposited
with the ATCC. For example, the polynucleotide can contain the nucleotide sequence
of the full length cDNA sequence, including the 5' and 3' untranslated sequences, the
coding region, with or without the signal sequence, the secreted protein coding region,
as well as fragments, epitopes, domains, and variants of the nucleic acid sequence.
Moreover, as used herein, a "polypeptide" refers to a molecule having the translated
amino acid sequence generated from the polynucleotide as broadly defined.

In the present invention, the full length sequence identified as SEQ ID NO:X
was often generated by overlapping sequences contained in multiple clones (contig
analysis). A representative clone containing all or most of the sequence for SEQ ID
NO:X was deposited with the American Type Culture Collection ("ATCC"). As
shown in Table 1, each clone is identified by a cDNA Clone ID (Identifier) and the
ATCC Deposit Number. The ATCC is located at 10801 University Boulevard,
Manassas, Virginia 20110-2209, USA. The ATCC deposit was made pursuant to the
terms of the Budapest Treaty on the international recognition of the deposit of
microorganisms for purposes of patent procedure.

A "polynucleotide" of the present invention also includes those polynucleotides 
capable of hybridizing, under stringent hybridization conditions, to sequences contained
in SEQ ID NO:X, the complement thereof, or the cDNA within the clone deposited with
the ATCC. "Stringent hybridization conditions" refers to an overnight incubation at 42°C
in a solution comprising 50% formamide, 5x SSC (750 mM NaCl, 75 mM sodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran
sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the
filters in 0.1x SSC at about 65°C.
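
The salt concentrations implied by the conditions above can be worked out by simple dilution arithmetic. The following Python sketch is illustrative only: it assumes that SSC components scale linearly with the stated strength, takes the 5x SSC figures (750 mM NaCl, 75 mM sodium citrate) from the definition above, and uses a hypothetical helper name, `ssc_at`, that is not part of this specification.

```python
# Component concentrations (mM) of 5x SSC, as stated in the definition of
# "stringent hybridization conditions" above.
SSC_5X_MM = {"NaCl": 750.0, "sodium citrate": 75.0}


def ssc_at(strength):
    """Return component concentrations (mM) of SSC at `strength`x.

    Assumes linear scaling from the 5x figures (an assumption of this sketch).
    """
    if strength <= 0:
        raise ValueError("strength must be positive")
    return {name: conc / 5.0 * strength for name, conc in SSC_5X_MM.items()}


hybridization = ssc_at(5.0)   # solution used for the overnight incubation
wash = ssc_at(0.1)            # solution used for the stringent wash

for name in SSC_5X_MM:
    print(f"{name}: {hybridization[name]:.1f} mM -> {wash[name]:.2f} mM")
```

Under this assumption, the 0.1x SSC wash corresponds to roughly 15 mM NaCl and 1.5 mM sodium citrate, a fifty-fold reduction in salt relative to the hybridization solution.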

Also contemplated are nucleic acid molecules that hybridize to the
polynucleotides of the present invention at lower stringency hybridization conditions.
Changes in the stringency of hybridization and signal detection are primarily
accomplished through the manipulation of formamide concentration (lower percentages
of formamide result in lowered stringency), salt conditions, or temperature. For
example, lower stringency conditions include an overnight incubation at 37°C in a
solution comprising 6X SSPE (20X SSPE = 3M NaCl; 0.2M NaH2PO4; 0.02M EDTA,
pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml salmon sperm blocking DNA;
followed by washes at 50°C with 1X SSPE, 0.1% SDS. In addition, to achieve even
lower stringency, washes performed following stringent hybridization can be done at
higher salt concentrations (e.g., 5X SSC).

Note that variations in the above conditions may be accomplished through the
inclusion and/or substitution of alternate blocking reagents used to suppress
background in hybridization experiments. Typical blocking reagents include
Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and
commercially available proprietary formulations. The inclusion of specific blocking
reagents may require modification of the hybridization conditions described above, due
to problems with compatibility.

Of course, a polynucleotide which hybridizes only to polyA+ sequences (such
as any 3' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a
complementary stretch of T (or U) residues, would not be included in the definition of
"polynucleotide," since such a polynucleotide would hybridize to any nucleic acid
molecule containing a poly (A) stretch or the complement thereof (e.g., practically any
double-stranded cDNA clone).
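
The exclusion above can be illustrated with a short check. This is a minimal sketch and not a method of the invention: the function name `is_poly_a_or_complement` is hypothetical, and it assumes the candidate sequence is supplied as a plain string of nucleotide letters.

```python
def is_poly_a_or_complement(seq):
    """Return True if `seq` consists only of A residues, or only of T/U residues.

    Such a sequence would hybridize to practically any polyA-containing clone,
    so it falls outside the definition of "polynucleotide" given above.
    """
    residues = set(seq.upper())
    if not residues:
        return False  # an empty string is not a sequence at all
    return residues <= {"A"} or residues <= {"T", "U"}


# Hypothetical example sequences, chosen only to exercise the check.
for probe in ["AAAAAAAAAAAA", "TTTTTTTT", "UUUUUU", "AAGCTTAAAA"]:
    status = "excluded" if is_poly_a_or_complement(probe) else "retained"
    print(probe, status)
```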

The polynucleotide of the present invention can be composed of any
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA
or modified RNA or DNA. For example, polynucleotides can be composed of single-
and double-stranded DNA, DNA that is a mixture of single- and double-stranded
regions, single- and double-stranded RNA, and RNA that is a mixture of single- and
double-stranded regions, hybrid molecules comprising DNA and RNA that may be
single-stranded or, more typically, double-stranded or a mixture of single- and
double-stranded regions. In addition, the polynucleotide can be composed of triple-stranded
regions comprising RNA or DNA or both RNA and DNA. A polynucleotide may also
contain one or more modified bases or DNA or RNA backbones modified for stability
or for other reasons. "Modified" bases include, for example, tritylated bases and
unusual bases such as inosine. A variety of modifications can be made to DNA and
RNA; thus, "polynucleotide" embraces chemically, enzymatically, or metabolically
modified forms.

The polypeptide of the present invention can be composed of amino acids joined
to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and
may contain amino acids other than the 20 gene-encoded amino acids. The
polypeptides may be modified by either natural processes, such as posttranslational
processing, or by chemical modification techniques which are well known in the art.
Such modifications are well described in basic texts and in more detailed monographs,
as well as in a voluminous research literature. Modifications can occur anywhere in a
polypeptide, including the peptide backbone, the amino acid side-chains and the amino
or carboxyl termini. It will be appreciated that the same type of modification may be
present in the same or varying degrees at several sites in a given polypeptide. Also, a
given polypeptide may contain many types of modifications. Polypeptides may be
branched, for example, as a result of ubiquitination, and they may be cyclic, with or
without branching. Cyclic, branched, and branched cyclic polypeptides may result
from posttranslation natural processes or may be made by synthetic methods.
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a
nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative,
covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond
formation, demethylation, formation of covalent cross-links, formation of cysteine,
formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI
anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation,
pegylation, proteolytic processing, phosphorylation, prenylation, racemization,
selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins
such as arginylation, and ubiquitination. (See, for instance, PROTEINS -
STRUCTURE AND MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W.
H. Freeman and Company, New York (1993); POSTTRANSLATIONAL
COVALENT MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic
Press, New York, pgs. 1-12 (1983); Seifter et al., Meth Enzymol 182:626-646 (1990);
Rattan et al., Ann NY Acad Sci 663:48-62 (1992).)

"SEQ ID NO:X" refers to a polynucleotide sequence while "SEQ ID NO: Y" 
refers to a polypeptide sequence, both sequences identified by an integer specified in 
Table 1. 

"A polypeptide having biological activity" refers to polypeptides exhibiting 
activity similar, but not necessarily identical to, an activity of a polypeptide of the
present invention, including mature forms, as measured in a particular biological assay,
with or without dose dependency. In the case where dose dependency does exist, it
need not be identical to that of the polypeptide, but rather substantially similar to the
dose-dependence in a given activity as compared to the polypeptide of the present
invention (i.e., the candidate polypeptide will exhibit greater activity or not more than
about 25-fold less and, preferably, not more than about tenfold less activity, and most
preferably, not more than about three-fold less activity relative to the polypeptide of the
present invention).
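
The fold-activity thresholds in this definition can be expressed as a small classification. The sketch below is illustrative only: the function name `activity_tier` and the tier labels are hypothetical, the activities are assumed to be positive numbers measured in the same assay, and the hard cutoffs at 3, 10, and 25 simplify the "about" language of the definition.

```python
def activity_tier(candidate, reference):
    """Classify a candidate's activity relative to a reference polypeptide.

    `fold_less` is how many times lower the candidate's activity is; values
    at or below 1 mean the candidate is at least as active as the reference.
    """
    if candidate <= 0 or reference <= 0:
        raise ValueError("activities must be positive")
    fold_less = reference / candidate
    if fold_less <= 3:
        return "most preferred"      # greater activity, or <= ~3-fold less
    if fold_less <= 10:
        return "preferred"           # <= ~10-fold less
    if fold_less <= 25:
        return "within definition"   # <= ~25-fold less
    return "outside definition"


# Hypothetical assay readings in arbitrary units.
print(activity_tier(candidate=40.0, reference=100.0))
print(activity_tier(candidate=2.0, reference=100.0))
```

A candidate with greater activity than the reference has a `fold_less` below 1 and therefore falls in the first tier, which matches the "greater activity or not more than" wording.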

Polynucleotides and Polypeptides of the Invention

FEATURES OF PROTEIN ENCODED BY GENE NO: 1 

This gene is expressed primarily in melanocytes and, to a lesser extent, in 
testes, ovary, kidney and other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer, disorders of neural crest derived cells including pigmentation
defects, melanoma, reproductive organ defects, and defects of the kidney. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the skin,
reproductive, and renal systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.
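
The comparison against a standard expression level described above can be sketched as a fold-change test. This example is an assumption-laden illustration: the specification does not state a numerical cutoff for "significantly higher or lower," so the `min_fold` parameter (default 2.0) and the function name `expression_shift` are hypothetical, and both levels are assumed to be positive measurements on the same scale.

```python
import math


def expression_shift(sample_level, standard_level, min_fold=2.0):
    """Compare a sample's expression level with the healthy-tissue standard.

    Returns "higher", "lower", or "unchanged" depending on whether the log2
    ratio exceeds the (hypothetical) fold-change cutoff in either direction.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    log2_ratio = math.log2(sample_level / standard_level)
    cutoff = math.log2(min_fold)
    if log2_ratio >= cutoff:
        return "higher"
    if log2_ratio <= -cutoff:
        return "lower"
    return "unchanged"


# Hypothetical readings in arbitrary units.
print(expression_shift(sample_level=8.0, standard_level=2.0))
print(expression_shift(sample_level=3.0, standard_level=2.0))
```

Working on the log2 scale treats increases and decreases symmetrically, so a four-fold rise and a four-fold drop sit the same distance from the standard.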

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating disorders that arise from alterations in 
the number or fate of neural crest derived cells including cancers such as melanoma and 
defects of the developing reproductive system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 2 

This gene is expressed primarily in infant brain and fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental disorders of the brain or lung. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous and pulmonary systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating or diagnosing disorders associated
with abnormal proliferation of cells in the central nervous system and developing lung.

FEATURES OF PROTEIN ENCODED BY GENE NO: 3 

This gene is expressed primarily in breast lymph node and to a lesser extent in
ovarian cancer and chondrosarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune responses such as inflammation or immune surveillance for



WO 98/54963



PCT/US98/11422 



tumors. This gene may be important for inflammatory responses associated with
tumors. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 236 as residues: Lys-45 to Val-50, Lys-69 to Arg-76.
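
The residue-range notation used for preferred epitopes (for example, "Lys-45 to Val-50") can be read mechanically. The sketch below is a hypothetical helper, not part of the specification: `parse_epitope_ranges` only parses the notation into residue names and positions, and it makes no use of the actual sequence listing, which is not reproduced here.

```python
import re

# Three-letter residue code, a hyphen, and a position, repeated on both sides
# of the word "to". Optional spaces after the hyphen are tolerated.
_RANGE = re.compile(r"([A-Z][a-z]{2})-\s*(\d+)\s+to\s+([A-Z][a-z]{2})-\s*(\d+)")


def parse_epitope_ranges(text):
    """Return a list of (start_residue, start_pos, end_residue, end_pos)."""
    return [
        (start_res, int(start_pos), end_res, int(end_pos))
        for start_res, start_pos, end_res, end_pos in _RANGE.findall(text)
    ]


ranges = parse_epitope_ranges("Lys-45 to Val-50, Lys-69 to Arg-76")
for start_res, start_pos, end_res, end_pos in ranges:
    length = end_pos - start_pos + 1  # inclusive span length in residues
    print(f"{start_res}{start_pos}..{end_res}{end_pos}: {length} residues")
```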

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment or diagnosis of immune responses
including those associated with tumor-induced inflammation. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 4 

This gene is expressed primarily in T-cells and T-cell lymphomas.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological diseases involving T-cells such as inflammation,
autoimmunity, and cancers including T-cell lymphomas. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of T-cells and other cells of the immune
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing and treating T-cell based disorders 
such as inflammatory diseases, autoimmune disease and tumors including T-cell
lymphomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 5

This gene is expressed primarily in activated monocytes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation, autoimmunity, infection, or disorders involving activation
of monocytes. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 238 as residues: Asp-19 to Arg-31.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing or treating diseases that result in 
activation of monocytes including infections, inflammatory responses or autoimmune
diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 6 

The translation product of this gene shares sequence homology with terminal
deoxynucleotidyltransferase which is thought to be important in catalyzing the
elongation of oligo- or polydeoxynucleotide chains.

This gene is expressed primarily in activated human neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer, particularly those of the blood such as leukemia and deficiencies
in neutrophils such as neutropenia. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the cardiovascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to terminal deoxynucleotidyltransferase
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
treatment and differential diagnosis of acute leukemias. Alternatively, this gene may
function in the proliferation of neutrophils and be useful as a treatment for neutropenia,
for example, following neutropenia as a result of chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 7 

The contig exhibits a reasonable homology to the human chorionic gonadotropin
(HCG) analogue-GT beta-subunit as disclosed in U.S. Patent No. 5,508,261 and PCT
Publication No. WO 92/22568. There is a high degree of conservation of the
structurally important cysteine residues in these identities.

This gene is expressed primarily in IL-1 and LPS induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of diseases of the immune
system since expression is primarily in neutrophils, and may be useful as a growth
factor for the differentiation or proliferation of neutrophils for the treatment of
neutropenia following chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 8 

This gene is expressed primarily in IL-1- and LPS-induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 241 as residues: Ser-14 to Pro-22, Leu-43 to Val-53.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of diseases of the
immune system since expression is primarily in neutrophils, and may be useful as a
growth factor for the differentiation or proliferation of neutrophils for the treatment of
neutropenia following chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 9 

This gene is expressed primarily in IL-1 and LPS induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 242 as residues: Tyr-22 to His-35.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of diseases of the immune
system since expression is primarily in neutrophils, and may be useful as a growth
factor for the differentiation or proliferation of neutrophils for the treatment of
neutropenia following chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 10 

This gene is expressed primarily in activated T-cells and to a lesser extent in
endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune dysfunctions including cancer of the T lymphocytes and
autoimmune disorders and inflammation. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of immune disorders,
particularly of T-cell origin, and may act as a growth factor for particular subsets of
T-cells such as CD4 positive cells, which would make this a useful therapeutic for the
treatment of HIV and other immune compromising illnesses.

FEATURES OF PROTEIN ENCODED BY GENE NO: 11 

This gene is expressed primarily in fetal tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of many developmental abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the developing fetus,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor or differentiation factor for 
particular cell types in the developing fetus and may be useful in replacement or other
types of therapy in cases where the gene is expressed aberrantly. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 12 

This gene is expressed primarily in T-cells and to a lesser extent in tumor tissue
including glioblastoma, meningioma, and Wilms' tumor.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system including autoimmune conditions such as
rheumatoid arthritis, inflammatory disorders and cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 245 as residues:
Thr-9 to Ser-14.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis/modulation of immune function
disorders, including rheumatoid arthritis and inflammatory responses.

FEATURES OF PROTEIN ENCODED BY GENE NO: 13

This gene is expressed primarily in placenta and to a lesser extent in fetal liver 
and bone marrow. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of hematological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the hematological and immune
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or
progenitor cells in the treatment of chemotherapy patients or kidney disease.

FEATURES OF PROTEIN ENCODED BY GENE NO: 14 
This gene is expressed primarily in stromal cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of hematopoietic disorders including cancer,
neutropenia, anemia, and thrombocytopenia. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the hematopoietic and immune systems, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or
progenitor cells, in particular following chemotherapy treatment. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 15 

The translation product of this gene shares sequence homology with epsilon-COP
from Bos taurus, which is thought to be important as a component of coatomer, a
complex of seven proteins that is the major component of the non-clathrin membrane
coat. Preferred polypeptides encoded by this gene comprise the following amino acid
sequences:

MAPPAPGPASGGSGEVDELJroVKNAFYIGSYQQaNEAX^
VFLYRAYl^QRKFGVVLDEIKPSSAPELQAVRMFADYLAHESR^
MSRSXDVT^^TFUMAASIYLHIX^N^
RLDLARKELKRMQDIX>EDATLTQ1ATAWS1ATGGEK^
PTLLUJSfGQAACHMAQGRWEAAEGIiQEALDK^^
PEVTNRYI^QLKDAHRSHPFIKEYQAKE^roFDRLVL^
(SEQ ID NO:458); or RDVERDVFLYRAYI^QRKFGVVLDEIKPSS APELQAVRA^
ADYLAHESRRDSIVAELDREMSRSXDVTISr^^
QGDS1£CTAMIVQILLKIJ)RIJDL^^
GEKUJDAYYIFQEMADKCSPTUJJJSrG^
SGYPETLVNLIVl^QHLGKPPEVTNRYI^QLKDAHRSH^
VLQYAPSA (SEQ ID NO:459).

This gene is expressed primarily in activated monocytes and T-cells, and to a 
lesser extent in multiple other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunomodulation, specifically relating to transport problems in these
cells. Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to epsilon-COP indicate that
polynucleotides and polypeptides corresponding to this gene are useful for
treating/diagnosing problems with the cellular transport of proteins that may result in
immunologic dysfunction.

FEATURES OF PROTEIN ENCODED BY GENE NO: 16 

The translation product of this gene shares sequence homology with an RNA
helicase, which is thought to be important in polynucleotide metabolism. The translation
product of this contig exhibits good homology to the LbeIF4A antigen of Leishmania
braziliensis. The LbeIF4A antigen, or immunogenic portions of it, can be used to
induce protective immunity against leishmaniasis, specifically L. donovani, L. chagasi,
L. infantum, L. major, L. braziliensis, L. panamensis, L. tropica and L. guyanensis. It
can also be used diagnostically to detect Leishmania infection or to stimulate a cellular
and/or humoral immune response or to stimulate the production of interleukin-12.

This gene is expressed primarily in colon cancer and to a lesser extent in
pituitary.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of cancers, particularly of the colon. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the gastrointestinal
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 249 as residues: Glu-93 to Ala-98, Gln-150 to Leu-156,
Leu-220 to Leu-231, Leu-268 to Arg-273, Val-324 to Pro-341, Arg-372 to Asn-380,
Ser-405 to Gly-410, Phe-426 to Ala-433, Glu-458 to Asp-470, Arg-506 to Ser-547.
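
Residue ranges written in this "Xaa-N to Yaa-M" form can be mapped onto a one-letter amino acid sequence mechanically. The sketch below is illustrative only: the function name `epitope_fragments` and the toy sequence are hypothetical, and the sequence of SEQ ID NO: 249 itself is not reproduced here. Residue numbers are treated as 1-based and inclusive, and the named end residues are checked against the sequence so that a mis-numbered range is reported rather than silently accepted.

```python
import re

# Three-letter amino acid codes mapped to their one-letter codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Glu-93 to Ala-98"; whitespace after the hyphen
# is tolerated because ranges are sometimes wrapped across lines.
RANGE_RE = re.compile(
    r"([A-Z][a-z]{2})-\s*(\d+)\s+to\s+([A-Z][a-z]{2})-\s*(\d+)"
)


def epitope_fragments(sequence, ranges_text):
    """Return (start, end, fragment) for each residue range in ranges_text.

    Positions are 1-based and inclusive. A ValueError is raised when a
    range falls outside the sequence or its named residues do not match.
    """
    fragments = []
    for first, start, last, end in RANGE_RE.findall(ranges_text):
        start, end = int(start), int(end)
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        fragment = sequence[start - 1:end]
        if (THREE_TO_ONE.get(first) != fragment[0]
                or THREE_TO_ONE.get(last) != fragment[-1]):
            raise ValueError(f"residues at {start}-{end} do not match")
        fragments.append((start, end, fragment))
    return fragments


if __name__ == "__main__":
    toy_sequence = "MKTAYIAKQRSE"  # hypothetical, not a sequence from the text
    print(epitope_fragments(toy_sequence, "Thr-3 to Ile-6, Gln-9 to Ser-11"))
```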

The tissue distribution and homology to RNA helicase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for development 
of diagnostic tests for colon cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 17 

The translation product of this contig has sequence homology to a cytoplasmic
protein that binds specifically to JNK, designated the JNK interacting protein-1 or JIP-1
in mice. JIP-1 caused cytoplasmic retention of JNK and inhibition of JNK-regulated
gene expression.

This gene is expressed primarily in brain, including pituitary, cerebellum, frontal
cortex, and fetal brain, and to a lesser extent in the kidney cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of central nervous system disorders including
ischemia, epilepsy, Parkinson's disease, and schizophrenia. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the central nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Furthermore, the translation product of this contig may suppress the effects of
the JNK signaling pathway on cellular proliferation, including transformation by the
Bcr-Abl oncogene. Preferred epitopes include those comprising a sequence shown in
SEQ ID NO: 250 as residues: Pro-6 to Ser-26, Ala-30 to Asp-41, Gly-55 to Ser-61,
Gly-74 to Thr-80, Tyr-117 to Ala-123, Tyr-167 to Asp-172, Ala-212 to Cys-223,
Pro-239 to Tyr-244.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for enhanced survival and/or differentiation of
neurons as a treatment for neurodegenerative disease. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 18 

The translation product of this gene shares sequence homology with a liver
stage antigen from a protozoan parasite.

This gene is expressed primarily in fetal tissue and to a lesser extent in activated
T-cells and other immune cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities and diseases of immune function. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to a protozoan antigen indicate that
polynucleotides and polypeptides corresponding to this gene are useful for
treatment/immune modulation of parasitic infections.

FEATURES OF PROTEIN ENCODED BY GENE NO: 19

Preferred polypeptides encoded by this gene comprise the following polypeptide
sequences:
MKAIGIEPSLATYHHIIRUT^QPGDPLKRSSFIIYDI^
PDDDKFFQSAMSICSSLRDLEJ^YQVHGLLKTGDNWKHGPDQHRNFYYS^
DU<XMEQro\TLKWYEDLIPSAYFPHSQTMI^
(SEQ ID NO:460); and/or KDSKEYGHTFRSDLREEILMLMARDKHPPELQVAF
ADCAADIKSAYESQPIRQTAQDWPATSmOAIIJ^RAGRTQEAWKNI^
NKIPRSELUSIELMDSAKVSNSPSQAffiWELASAFSLPI^
EQKEALSNLTALTSDSDTDSSSDSDSDTSEGK (SEQ ID NO:461). Polynucleotides
encoding such polypeptides are also provided.

This gene is expressed primarily in stromal and CD34 depleted bone marrow 
cells and to a lesser extent in tissues of embryonic origin. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of hematologic origin including cancers and immune
dysfunction. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the hematopoietic and immune systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 252 as residues: Ser-28 to Ghi-34.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or
progenitor cells, which may be useful in the treatment of chemotherapy patients
suffering from neutropenia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 20

Preferred polypeptide fragments can be found in an alternative open reading
frame. These preferred polypeptides comprise the amino acid sequence:
MSSDNESDEDEDLKl^LRRLRDKHLmQDLQSRQKHEIESLYTKI^
IPPAAPI^GRRRRPTKSKGSKSSRSSSUjNKSPQI^GNI^GQSAASVUIPQQT^
HPPGNIPESGQNQLLQPLKPSPSSDNLYSAFTSDGAISVPSLSAPGQGTSSTNTV
GATVNSQAAQAQPPAMTSSRKGTFTODUIKLVDNWARDAMNLSGRRGSKGH
MNYEGPGMARKFSAPGQmSMTSNLGGSAPISAASATSLGHFTKSMCPPQQY
GFPATPFGAQWSGTGGPAPQPLGQFQPVGTASLQNFNISNLQKSISNPPGSNL
RTT (SEQ ID NO:462); IQDLQSRQKHEDESLYTKLGKVPPAVIIPPAAPLSGRRRR
PTKSKGSKSSRSSSLGNKSPQLSGNLSGQSAASVLHPQQTLHPPGNIPESGQN
QLLQPLKPSPSSDNLYSAFTSDGAIS VPSLSAPGQGTSST (SEQ ID NO:463);
TSDGAIS VPSLS APGQGTSSTNTVGATVNSQAAQAQPPAMTSSRKGTFTODLH
(SEQ ID NO:464); KGHMNYEGPGMARKFSAPGQLCISMTSNLGGSAPISAAS
ATSLGHFTK (SEQ ID NO:465); QPLKPSPSSDNLYSAFTSDGAISVPSLSAPG
(SEQ ID NO:466). Also preferred are polynucleotide fragments encoding these
polypeptide fragments.

This gene is expressed in fetal liver and tissues associated with the CNS.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver and CNS diseases. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the liver and CNS, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 253 as residues:
Gln-26 to Lys-34.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of liver diseases such
as hepatocellular carcinomas and diseases of the CNS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 21

In an alternative reading frame, this gene shows sequence homology to two
recently cloned genes, karyopherin beta 3 and Ran_GTP binding protein 5. (See
Accession Nos. gi|2102696 and gnl|PID|e328731.) The Ran_GTP binding protein is
related to importin-beta, the key mediator of nuclear localization signal (NLS)-dependent
nuclear transport. Based on homology, it is likely that this gene may have activity
similar to the Ran_GTP binding protein. Preferred polypeptide fragments comprise the
amino acid sequence: WVAAAESNIXLlJ^CAXWGPEYLTQNrWHFM
IGTEPDSDVLSEIMHSFAK (SEQ ID NO:467). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed in thymus tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 22 

This gene is expressed primarily in prostate and osteoclastoma tissues. 

Preferred polypeptide fragments also comprise the amino acid sequence:
MEINNQNaTVaDLVRTVMENGVEGLUFGAFlJ>ES\^
AHSQKRRLDGWSFIRHLRVHYCVSLTIHFS (SEQ ID NO:468). Also preferred are
polynucleotide sequences encoding this polypeptide fragment.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, bone and prostate diseases, and cancers, particularly of the bone and
prostate. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the bone and prostate systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 255 as residues: Met-1 to Ser-11.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of bone and prostate
disorders, especially cancers of those systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 23 

This gene shares sequence homology with the FK506-binding protein (FKBP-13)
family, a known cytosolic receptor for the immunosuppressants. Recently, another
group has cloned a very similar gene, recognizing the homology to the FK506-binding
protein family, calling their gene FKBP23. (See Accession No. 2827255.)

This gene is expressed primarily in lymphoid tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample, especially for those susceptible to immune suppressant therapies, and
for diagnosis of diseases and conditions, which include, but are not limited to, immune
suppressant disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 256 as residues: Ala-19 to Val-31,
Arg-38 to Gly-49, Ala-61 to Lys-66, Tyr-68 to Pro-78, Gly-116 to Ala-121, Asp-154 to
Ser-162, Glu-173 to Ghi-186, Phe-194 to Gly-203, Pro-207 to Val-212.

The tissue distribution and homology to FKBP-12 and -13 indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of immune suppressant disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 24

This gene is expressed primarily in the brain and in the retina. This gene maps 
to chromosome 8, and therefore can be used in linkage analysis as a marker for 
chromosome 8. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological and ocular associated disease states. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly disorders of the central
nervous system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 257 as residues: Cys-34 to Asp-40.

The tissue distribution in retina indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and/or detection of eye disorders
including blindness, color blindness, impaired vision, short and long sightedness,
retinitis pigmentosa, retinitis proliferans, and retinoblastoma. Expression in the brain
indicates that this gene is useful for the detection/treatment of neurodegenerative disease
states and behavioral disorders such as Alzheimer's Disease, Parkinson's Disease,
Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive compulsive
disorder and panic disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 25 

This gene shows sequence homology to a newly identified class of proteins
expressed in the nervous system, called the stathmin family. (See Accession No. 2585991;
see also Eur. J. Biochem. 248 (3), 794-806 (1997).) The stathmin family appears to be
a ubiquitous phosphoprotein involved as a relay integrating various intracellular
signaling pathways. These pathways affect cell proliferation and differentiation.
Preferred polypeptide fragments comprise the amino acid sequence:
QDKHAEEVRKNKELKEEASR (SEQ ID NO:469); QQDLSPWAAPVGCPLXXASX
TCHXLPI^GCLRRQSXSLPVVAXLCWFSCPLASU^VPGQP<^
KHAEEVRKNKELKEEASR (SEQ ID NO:470). Also preferred are the
polynucleotide fragments encoding these polypeptide fragments.
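
Whether a candidate polypeptide "comprises" a preferred fragment reduces to a substring search over one-letter amino acid codes. The sketch below is illustrative only: the function name `fragment_positions` and the flanking residues in the usage example are hypothetical, while the fragment is the sequence given above as SEQ ID NO:469.

```python
def fragment_positions(polypeptide, fragment):
    """Return the 1-based start positions where fragment occurs in polypeptide.

    Whitespace is removed and case is normalized first, since sequences
    are often wrapped across lines. Overlapping occurrences are reported.
    """
    seq = "".join(polypeptide.split()).upper()
    frag = "".join(fragment.split()).upper()
    if not frag:
        raise ValueError("fragment must not be empty")
    positions = []
    start = seq.find(frag)
    while start != -1:
        positions.append(start + 1)          # convert to 1-based numbering
        start = seq.find(frag, start + 1)    # continue after this match
    return positions


if __name__ == "__main__":
    seq_id_469 = "QDKHAEEVRKNKELKEEASR"
    # Hypothetical candidate: the fragment with made-up flanking residues.
    candidate = "MAS" + seq_id_469 + "GG"
    print(fragment_positions(candidate, seq_id_469))
```

An empty result means the candidate does not comprise the fragment; exact matching is a simplification, since it ignores conservative substitutions.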
This gene is expressed primarily in brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 26

The polynucleotide sequence of this gene contains a domain similar to a Flt3
ligand peptide. Preferred polypeptide fragments comprise the amino acid sequence:
FmCCTTQPCRSSARRPCWVPMVPSPEGREXQPTCPS (SEQ ID NO:471). Thus,
this gene may have activity in binding to Flt3 receptors, a process known to promote
angiogenesis and/or lymphangiogenesis.

This gene is expressed in human tonsil, and to a lesser extent in 
teratocarcinoma, placenta, colon carcinoma, and fetal kidney. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for identification of the tissue(s) or cell type(s) present in a biological sample
and for diagnosis of diseases and conditions, which include, but are not limited to,
diseases of the tonsil, as well as cancers, such as colon, reproductive, and kidney
cancers. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
tonsils, colon, reproductive organs, and kidneys, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 259 as residues:
Pro-22 to Glu-33.

The tissue distribution in tonsil and several cancers and fetal tissues indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of diseases of the tonsil or colon, such as tonsillitis and
inflammatory diseases involving the nose and paranasal sinuses, especially during
infection by influenza viruses, adenoviruses, parainfluenza viruses, or rhinoviruses. The gene may also be
useful in the diagnosis and treatment of neoplasms of nasopharynx or colon origin.

FEATURES OF PROTEIN ENCODED BY GENE NO: 27 

An alternative reading frame contains a large open reading frame that encodes a
preferred polypeptide. Preferred polypeptide fragments comprise the amino acid
sequence:
MKRSUSfENSARSTAGClPWm^IQKKRNRQPLTS^
PUTDWAWEAVNPEXAPVMKT\^TGQIPHSVSRP1^
GGWSYRDGNKNTSLKTWXKNDFKPQCKRTNL^
QIJRTPEPPOTJSRNKETEUJRQTHSSKISGCI^
KXQMlJ)DIPEDNTLKETSLYQI^FKEKASSUmSAVIESM
FEVI^VmSAVTPGPYYSKTFIJ^lRIXJ
NYDQKKNIFCJCVS VRPAS VSEQKTFQAFV^ (SEQ ID
NO:472); SQDSVFNSI(JSNTGRSQGGWSYRIX3NKOTSLKTWXK^
(SEQ ID NO:473); NKETCLLRQTHSSKISGCTMRGLDKNSALQTLKPNF (SEQ ID
NO:474); SSLRIISAVIESMKYWREHAQKTVIJJ^EV^
(SEQ ID NO:475); and PRURGRVHRCVGNYDQKKNIFQCVS VRPASVSEQKT
FQAFV (SEQ ID NO:476).

This gene is expressed primarily in human testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders, including cancer. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the male reproductive system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful as a hormone with reproductive or other systemic
functions; for contraceptive development; and for male infertility of testicular causes,
such as Klinefelter's syndrome, varicocele, and orchitis; male sexual dysfunctions;
testicular neoplasms; and inflammatory disorders such as epididymitis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 28

This gene is expressed primarily in apoptotic T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases relating to T cells, as well as cancer in general. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly disorders of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for immune disorders. Moreover, since the gene
was isolated from an apoptotic cell and based on the understanding of the relationship
of apoptosis and cancer, it is likely that this gene may play a role in the genesis of
cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 29

This gene is expressed primarily in human tonsils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, gastrointestinal disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the gastrointestinal system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of gastrointestinal
diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 30

The translation product of this gene shares sequence homology with the C44C1.2
gene product of Caenorhabditis elegans, whose function is unknown. Preferred polypeptide
fragments comprise the amino acid sequence:
GVFRPCVCGRPASLTCSPU)PEVGPYCDTPITS4RTLFN1^ 

SDAKKAASKTLI£KSQFSDKPVQDRGLVVTDLKAESVVIJ^^
FAGDVUjYVTPWNSHGYDVTKVFGSKFTQISPVWLQL^ 
XaXJGWMRAVRKHAKGUnWRUJ^)^^ 
KNQHFDGFVVEVWNQUJSQKRVGUHMLTH^ 
DQIXjMFTHKEFEQIJ^VIJXJFSIMI^ 

KXKWRTKSSWGSTSMXWTXRXPXDARXPWGXRXIQXLKDHXPRN^
PQ (SEQ ID NO:477); TCSPU)PEVGPYa)TFIMRTLFNIiWlJ\^ 
(SEQ ID NO:478); LWTOLKAESVVLEHRSYCSAKARDRHFAGDXT^ 
NSHGYDVTKVFGSKF (SEQ ID NO:479); REMFEVTGLHDVDCJGWMRAVRK 
HAKGUnWRlJLJ^WTYDDFRNVLDSEDE (SEQ ID NO:480); HFDGFWEVW 

NQUJSQKRVGLIHNILTHLAEALHQARLLAIXWPy^^ (SEQ ID

NO:481); IXjFSU^ITYDYSTAHQPGPNAPI^WVRACVQVU^ 
GST (SEQ ID NO:482). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. This gene maps to human chromosome 11, and therefore is
useful in linkage analysis as a marker for chromosome 11.

This gene is expressed primarily in human T cells and to a lesser extent in
human colon carcinoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders and cancer. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune and gastrointestinal systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 263 as residues: Leu-21 to Ala-30, Ser-38 to Asp-47, Pro-87 to Asp-94, Leu-197
to Thr-204, Pro-256 to Ser-262, Thr-277 to Arg-282, Thr-293 to Trp-303.
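
Epitope residue ranges of the form "Leu-21 to Ala-30" recur throughout this section. As an aid to the reader, the following Python sketch shows one way such a list could be parsed into structured ranges. The function name `parse_epitope_ranges` and the validation rules are hypothetical and are not part of the specification; only the first two ranges listed above are used as sample input.

```python
import re

# Three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}

# Matches ranges such as "Leu-21 to Ala-30".
RANGE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples.

    Hypothetical helper: rejects unknown residue codes and ranges whose
    start position lies after the end position.
    """
    ranges = []
    for start_aa, start, end_aa, end in RANGE.findall(text):
        if start_aa not in THREE_LETTER or end_aa not in THREE_LETTER:
            raise ValueError(f"unknown residue code in {start_aa}-{start} to {end_aa}-{end}")
        start_pos, end_pos = int(start), int(end)
        if start_pos > end_pos:
            raise ValueError(f"range {start_pos}-{end_pos} is reversed")
        ranges.append((start_aa, start_pos, end_aa, end_pos))
    return ranges


if __name__ == "__main__":
    for start_aa, start_pos, end_aa, end_pos in parse_epitope_ranges(
        "Leu-21 to Ala-30, Ser-38 to Asp-47"
    ):
        # Length of each epitope in residues, counting both endpoints.
        print(start_aa, start_pos, end_aa, end_pos, end_pos - start_pos + 1)
```

Rejecting unknown residue codes is a deliberate choice in this sketch, since it flags transcription errors in a range list instead of silently accepting them.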

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders and gastrointestinal diseases.



FEATURES OF PROTEIN ENCODED BY GENE NO: 31 
The translation product of this gene shares sequence homology with ribosomal
protein L11 of Caenorhabditis elegans. (See Accession No. 156201.) Preferred
polypeptide fragments comprise the amino acid sequence:
ERGVSINQFCKEFNERTKDIKEGIPLPTKILVKPDRTF^^ 
IEKGAR(3TGKEVAGLVT1JKHVYEIARIKAQDEAFAUJDW 
GIRWKDLSSEELAAF QKERAIFLAAQKEADLAAQEEAAKK (SEQ ID NO:483).
Also preferred are polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed in human embryo tissue and to a lesser extent in human 
epithelioid sarcoma and other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for identification of the tissue(s) or cell type(s) present in a biological sample
and for diagnosis of diseases and conditions, which include, but are not limited to,
developmental disorders and epithelial cell cancer. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the embryonic and epithelial cell systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 264 as residues: Lys-34 to Gly-40.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of developmental 
disorders and epithelial cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 32

This gene is expressed primarily in resting T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory and general immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of disorders of
the immune system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 33 

This gene is believed to reside on chromosome 1. Accordingly, polynucleotides
derived from this gene are useful in linkage analysis as chromosome 1 markers.

This gene is expressed primarily in prostate and to a lesser extent in Soares adult
brain, human umbilical vein endothelial cells, and amniotic cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, prostate-related disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the urinary system and nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful
for the diagnosis and treatment of disorders of the urinary and nervous systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 34 

This gene shares sequence homology with the R05G6.4 gene product. (See Accession No.
gill 3263380.) This gene also shares sequence homology with the cyclophilin-like protein
CyP-60. (See Accession No. 1 199598; see also Biochem. J. 314 (1), 313-319
(1996).) Preferred polypeptide fragments comprise the amino acid sequence:

AVYTYHEK3aCDTAASGYGTQNIRI^RDAVKDFIX:CC^ 

YEREAIliYIIJiQKKEIARQMKAYEKQRGmREEQKELQ 

SAIVSRP IJSIPFrAKAI^GTSPDDVQPGPSVGPPSKDKDKVU>SFWIPSL^^ 
ATKI^KPSRTVTCPMSGKPUU^DLTPVHFTP^

RDSl^NATPCAVlJffSGAVVTlJECVEK^^ 

(SEQ ID NO:484); YLYEREAIIJEYILHQKKEIARQMKAYEKQRGTRREEQKE^ 
RAASQDHVRGFLE (SEQ ID NO:485); and FTAKALSGTSPDDVQPGPSVGPP 
SKDKDKY1J>SFWIPSLTPEAKATKIJEKPSRTVTCPMSGK^ (SEQ ID NO:486). 
Also preferred are polynucleotide fragments that encode these polypeptide fragments.

This gene is expressed primarily in human testis and to a lesser extent in other
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders and in particular testicular cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of disorders of the
male reproductive system and in particular of testicular cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 35 

The translation product of this gene shares sequence homology with Lpe5p of
Saccharomyces cerevisiae, which is thought to be important in the metabolism of
phospholipids.

This gene is expressed primarily in liver and brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, metabolic disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the metabolic and nervous systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 268 as residues: Pro-14 to Leu-20, Lys-28 to Asn-38, Arg-109 to Arg-114,
Lys-119 to Asn-124, Glu-152 to Leu-157, Pro-172 to Val-180.

The tissue distribution and homology to Lpe5p of Saccharomyces cerevisiae
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
the diagnosis and treatment of metabolic and nervous disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 36
This gene shares sequence homology with the nuclear ribonucleoprotein U (HNRNP
U) encoded by C. elegans. (See Accession No. gill703576.) Preferred polypeptide
fragments comprise the amino acid sequence:
MDTSENRPENDVPEPPMPIADQVSNDDRPEGSX^EEKKESSL^

SATKGVPAGNSDTEGGQPGRKRRWGASTATTQKKPSISITTESLKSLIPDIKPL 
AGQEAX^LHADDSRISEDETERNGDDGTIIDKGLKiaiTVTQVWAEGQ^ 
REEEEEEKEPEAEPPWPQVSVEVALPPPAEHEVKKVTLGDTLTRRSISQQKSGV 
SITTODPWTAQVPSPPRGKISNIVfflSNLVRPFTLGQLKElXGRTGTL^ 
DKIKSHCFVITSTVEEAVATRTALHGVKWPQSNPK^

LXODRPSblKliiEQGIPRPIJffPPPPPVQPPQHPRAEQREQERAV^ 

MERRERTRSEREWDRDKVREGPRSRSRSRXRRRKERAKSKEKICSEKKEKAQE 
EPPAmJ)DIJ^RKTKAAPCIYWIJ>LT^^ 

EQKEREKEAERERNRQLEREKRREHSRERDRERERERERDRGDRDRDRERDRE 
RGRERDRRDTKRHSRSRSRSTPVRDRGGR (SEQ ID NO:488). Also preferred are
the polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in epididymis.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the male reproductive system. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the male reproductive system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of male
reproductive disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 37

This gene is expressed primarily in amygdala.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory diseases and reproductive disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the amygdala,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of inflammatory
diseases and reproductive disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 38 

This gene shares sequence homology with human opsonin protein P35
fragment. (See Accession No. R94 181.) The opsonin protein activates the phagocytosis
of pathogenic microbes by phagocytic cells. Preferred polypeptide fragments comprise
the amino acid sequence: GCDSCPPHLPREAFAQDTQAEGECSSRAERADMCPDAP
PSQEVPEGPGAAP (SEQ ID NO:489). Also preferred are polynucleotide fragments
encoding these polypeptide fragments.

This gene is expressed in immune-related tissues such as thymus, macrophages, and
T cells and to a lesser extent in many other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders and infectious disease. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system and infectious disease,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 271 as residues: Lys-9 to Arg-14, Met-38 to Asp-51.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders,
as well as the treatment and/or diagnosis of infectious disease.

FEATURES OF PROTEIN ENCODED BY GENE NO: 39 

The translation product of this gene shares sequence homology with alpha-2
type I collagen, which is thought to be important in tissue repair. (See, e.g., 21 1607.)
Preferred polypeptide fragments comprise the amino acid sequence: PQLPSCGRPW
PGTAS VFQSHTQGPREDPDPCRAQGSAGTHCPISLSPPRQ (SEQ ID NO:490).
Also preferred are the polynucleotide sequences encoding these polypeptide sequences.

This gene is expressed primarily in the brain and to a lesser extent in the kidney
and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain, kidney, and immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the brain, kidney, and immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to alpha-2 type I collagen indicate that
polynucleotides and polypeptides corresponding to this gene are useful in tissue repair
and for diagnosis and treatment of brain, kidney, and immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 40 

The translation product of this gene shares sequence homology with
mini-collagen, which is thought to be important in tissue repair and tumor metastasis.
(See Accession No. gnllPIDIdl006976.) Preferred polypeptide fragments comprise the
amino acid sequence: PGFRGPSGSLGCSFFPRSLGRVLPPGCQRPGAHAD
SSPPPTP (SEQ ID NO:491). Also preferred are polynucleotides encoding this
polypeptide fragment.

This gene is expressed in ovarian cancer and to a lesser extent in dendritic cells
and smooth muscle.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumor metastasis and tissue repair. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly those involving tumor metastasis and tissue repair,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 273 as residues: Asn-2 to His-11.

The tissue distribution and homology to the mini-collagen gene indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of tumor metastasis and tissue repair.

FEATURES OF PROTEIN ENCODED BY GENE NO: 41 

This gene shares sequence homology with the HIV TAT protein. (See
Accession No. 328416.) Preferred polypeptide fragments comprise the amino acid
sequence: EDLKKPDPASIJIAASCGEGKKRKACKNCTCG1j\EEI^ 
SREQMSSQPKSACGNCYLGDAFRCASCPYLGMPAFKPGEKVLLS (SEQ ID 
NO:492); EDLKKPDPASLRAASCGEGKKRKAOO^CTCGLAEELEKEK
SREQMSSQPKSACGNCYLGDAFRCASa>YLGMPAFKPGEKVI^ 

(SEQ ID NO:493); CGNCmJDAFRCASCTYUjMPAFKPGEKXOJ^

(SEQ ID NO:494); SCGEGKKRKACKNCTCGLAEELEKE (SEQ ID NO:495); 
SQPKSAC GNCYLGDAFRCASC (SEQ ID NO:496); and REAGQNSERQYVS 
LSRD (SEQ ID NO:497). Also preferred are polynucleotide fragments encoding these
polypeptide fragments.

This gene is expressed primarily in the infant brain and to a lesser extent in the
breast and testes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain, testes and breast disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly brain, testes and breast disorders,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 274 as residues: Pro-7 to Val-15.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of brain, testes and
breast, and other related disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 42 

This gene is expressed primarily in the infant brain and human cerebellum, and to a
lesser extent in medulloblastoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain related disorders and medulloblastoma and other brain cancers.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly
brain related disorders and brain cancers, including medulloblastoma, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 275 as residues: Thr-41 to Glu-47.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of human brain related
disorders, brain cancers, and medulloblastoma.



FEATURES OF PROTEIN ENCODED BY GENE NO: 43

The translation product of this gene shares sequence homology with a
phosphotyrosine-independent ligand for the lck SH2 domain, which is thought to be
important in signal transduction involving the lck SH2 domain. (See Accession No.
gill 184951.) Preferred polypeptide fragments
comprise the amino acid sequence: ESSGQARTLADPGPGWPRQQGMCFGSLT
GI^TTPHGFLTVSAEADPRLIESI^QMl^MGFSDEGGWLTRIXQTKNYDra^ 
DTIQYSKH (SEQ ID NO:498). Also preferred are polynucleotide fragments encoding
this polypeptide fragment. It is likely that this gene is a new member of a family of
phosphotyrosine-independent ligands for the lck SH2 domain.

This gene is expressed primarily in the placenta and to a lesser extent in
endothelial cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive, cardiovascular, immune, and infectious diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
cardiovascular, reproductive, and immune systems, and infectious diseases, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to a phosphotyrosine-independent ligand
for the lck SH2 domain indicate that polynucleotides and polypeptides corresponding
to this gene are useful for diagnosis and treatment of cardiovascular, reproductive, and
immune system diseases, as well as infectious diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 44

This gene is expressed primarily in the fetal brain, cerebellum and to a lesser 
extent in the placenta. 

Therefore, polynucleotides and polypeptides of the invention are usefiil as 
5 reagents for differential identification of the tissue(s) or cell type(s) present in a 

biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neuronal cell related disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 

10 the above tissues or ceUs, particularly of the neuronal cell related disorders, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 277 as residues: Thr-20 to Gly-28. 
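
The residue ranges used to name preferred epitopes (for example, Thr-20 to Gly-28) can be read as 1-based, inclusive positions in the polypeptide. The following is a minimal sketch under that assumption; the function name `epitope` and the toy sequence are hypothetical, and the toy sequence is not SEQ ID NO: 277, which is not reproduced in this section.

```python
# Three-letter to one-letter amino acid codes (20 standard residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def epitope(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a protein sequence.

    `start` and `end` are (three_letter_name, position) pairs, e.g. ("Thr", 20).
    The terminal residues are checked against the stated names so that an
    off-by-one error or a wrong sequence is caught early.
    """
    first_name, first_pos = start
    last_name, last_pos = end
    if not (1 <= first_pos <= last_pos <= len(sequence)):
        raise ValueError("residue range lies outside the sequence")
    fragment = sequence[first_pos - 1:last_pos]
    if fragment[0] != THREE_TO_ONE[first_name]:
        raise ValueError("first residue does not match the stated name")
    if fragment[-1] != THREE_TO_ONE[last_name]:
        raise ValueError("last residue does not match the stated name")
    return fragment


# Hypothetical 30-residue sequence built so that position 20 is Thr and
# position 28 is Gly; it is an illustration only, not a disclosed sequence.
TOY_SEQUENCE = "M" + "A" * 18 + "T" + "PSLKQEV" + "G" + "LL"

print(epitope(TOY_SEQUENCE, ("Thr", 20), ("Gly", 28)))
```

Checking the terminal residues against their stated names is a design choice for this sketch: it makes a mismatch between the range and the sequence fail loudly rather than return a shifted fragment.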

The tissue distribution and homology to proline-rich protein genes indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of neuronal cell related disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 45 

The translation product of this gene shares sequence homology with 
precerebellin of human, which is thought to be important in synaptic physiology. (See
Accession No. gi|180251.) It has been observed that cerebellin-like immunoreactivity is
associated with Purkinje cell postsynaptic structures. Thus, it is likely that this gene
also has synaptic activity. Preferred polypeptide fragments comprise the amino acid
sequence: QEGSEPVUJEGECXWCEPGRAAAGGPGGAAIXJE/^ 
RSHHHEPAGETGNGTSGAIYFDQ\nLVNEGH3GFDRASGSFV^
VVKVYNRQTVQVSLMLNTW\aSAFANDPD^^
LRRGXSTGW (SEQ ID NO:499). Also preferred are polynucleotide fragments
encoding these polypeptide fragments. 

This gene is expressed primarily in cerebellum and infant brain. By Northern 
analysis, a single transcript of 2.4 kb was observed in brain tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal cell signal transduction and synaptic physiology. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the neuronal cell 
signal transduction and synaptic physiology, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to gene or gene family indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of neuronal cell related disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 46

This gene is expressed in fetal liver and spleen, and to a lesser extent in bone 
marrow, umbilical vein, and T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the immune system, particularly hematopoiesis. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the hematopoiesis
and immune disorders, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 279 as residues: Asp-30 to Glu-57. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of hematopoietic and
immune disorders.



FEATURES OF PROTEIN ENCODED BY GENE NO: 47

The translation product of this gene shares sequence homology with a 12 kD 
nucleic acid binding protein of feline calicivirus which is thought to be important in viral
replication. (See Accession No. 59264) 

This gene is expressed primarily in human cardiomyopathy and to a lesser 
extent in T helper cells, fetal brain and synovial sarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cardiomyopathy as well as viral infection. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the cardiovascular system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 280 as residues: Trp-20 to Cys-26. 

The tissue distribution in cardiomyopathy and homology to viral 12 kD nucleic 
acid binding protein indicates that polynucleotides and polypeptides corresponding to 
this gene are useful for diagnosis and intervention of cardiomyopathy, including those 
caused by ischemic, hypertensive, congenital, valvular, or pericardial abnormalities. 
The gene expression pattern may be the consequence or the cause for these conditions. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 48 

The translation product of this gene shares sequence homology with tumor
necrosis factor related gene product which is thought to be important in tumor necrosis,
bacterial and viral infection, immune diseases and immunoreactions.

This gene is expressed primarily in colon and to a lesser extent in ovarian and
breast cancers.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumors of colon, ovary or breast origins. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the colon, ovary and breast, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 
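
The comparison described above, a sample expression level measured against the standard level from healthy tissue or fluid, can be sketched as a simple ratio test. This is an illustration only: the text does not define what counts as "significantly" higher or lower, so the two-fold cutoff, the function name `classify_expression`, and the numeric levels below are all assumptions.

```python
def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Classify a sample expression level relative to a standard level.

    Returns "higher", "lower", or "comparable". The default two-fold
    threshold is an arbitrary assumption, not a value taken from the text.
    """
    if standard_level <= 0 or sample_level < 0:
        raise ValueError("levels must be non-negative and the standard positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = sample_level / standard_level
    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1.0 / fold_threshold:
        return "lower"
    return "comparable"


# Hypothetical normalized expression values (arbitrary units).
print(classify_expression(8.0, 2.0))  # sample is four-fold above the standard
print(classify_expression(1.0, 4.0))  # sample is four-fold below the standard
print(classify_expression(3.0, 2.0))  # within two-fold of the standard
```

In practice a statistical test over replicate measurements would be a more defensible criterion than a fixed fold cutoff; the ratio is used here only to keep the sketch short.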

The tissue distribution and homology to tumor necrosis factors indicates that
polynucleotides and polypeptides corresponding to this gene are useful for intervention
of cancers of colon, ovary and breast origins, because TNF family members are known 
to be involved in the tumor development. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 49 

The translation product of this gene shares sequence homology with mucins,
such as epithelial mucin, which is thought to be important in extracellular matrix
functions such as protection, lubrication and cell adhesion (See for example Accession
No. R68002). Preferred polypeptide fragments comprise the following amino acid
sequence: PRSRPALRPGRQRPPSHSATSGVLRPRKKPDP (SEQ ID NO:500).
Also preferred are polynucleotide fragments encoding these polypeptide fragments.
Moreover, this gene maps to chromosome 22q11.2-qter, and therefore, can be used as
a marker in linkage analysis for chromosome 22. 
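
Because the genetic code is degenerate, many different polynucleotide fragments encode the same polypeptide fragment. The sketch below illustrates this for the SEQ ID NO:500 peptide quoted above, using the standard genetic code. The helper names are hypothetical, and the single encoding it prints is an arbitrary choice (the first codon in table order for each residue), not a sequence disclosed in this document.

```python
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse lookup: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate a DNA coding sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def one_encoding(peptide):
    """Return one of the many DNA sequences that encode the peptide."""
    return "".join(CODONS_FOR[aa][0] for aa in peptide)


def count_encodings(peptide):
    """Count the distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)


PEPTIDE = "PRSRPALRPGRQRPPSHSATSGVLRPRKKPDP"  # SEQ ID NO:500 as quoted above

dna = one_encoding(PEPTIDE)
print(dna)
print(translate(dna) == PEPTIDE)
print(count_encodings(PEPTIDE))
```

The round trip through `translate` confirms that the chosen encoding reproduces the peptide, and `count_encodings` gives a sense of how large the family of encoding polynucleotides is.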

This gene is expressed primarily in corpus callosum.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors, especially of corpus callosum, as well as metastatic lesions.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
corpus callosum and other solid tissues, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to mucins indicates that polynucleotides
and polypeptides corresponding to this gene are useful for serum tumor markers or
immunotherapy targets because tumor cells have greatly elevated levels of mucin
expression and shed the molecules into the epithelial tissues.


FEATURES OF PROTEIN ENCODED BY GENE NO: 50 

This gene is expressed primarily in CD34 depleted buffy coat cord blood and
primary dendritic cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematopoietic disorders and immunological disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the hematopoietic and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution in CD34 depleted buffy coat cord blood and primary 
dendritic cells indicates that polynucleotides and polypeptides corresponding to this 
gene are useful for diagnosis and treatment of hematopoietic and immune disorders.
Secreted or cell surface proteins in the above tissue distribution often are involved in
cell activation (e.g., cytokines) or are molecules involved in cell surface activation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 51 

The translation product of this gene shares sequence homology with interferon
induced 1-8 gene encoded polypeptide which is thought to be important in binding to
retroviral rev responsive element. Preferred polypeptide fragments comprise the
following amino acid sequence: MTLITPSXKLTFXKGNKSWSSRACSSTLVDP
(SEQ ID NO:501). Also preferred are polynucleotide fragments encoding these 
polypeptide fragments.

This gene is expressed primarily in CD34 positive cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, retroviral infection, such as AIDS, and other immune disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 284 as residues: Gln-51 to Trp-62. 

The tissue distribution and homology to interferon induced gene 1-8 indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for
intervention of retroviral infection including HIV. The factor may be involved in viral
stability or viral entry into the cells. Alternatively, the virus/factor complex may elicit 
the cellular immune reaction. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 52 

This gene shares sequence homology to immunoglobulin lambda chain (See
Accession No. 2865484). Therefore it is likely that this gene has activity similar to an
immunoglobulin lambda chain. Preferred polypeptide fragments comprise the following
amino acid sequence: GHPSPALSIAPSDGSQLJ<I)EWYGEAHVTRYCKKPLTNS
HLETEAQSSSL (SEQ ID NO:502). Also preferred are polynucleotide fragments
encoding these polypeptide fragments.

This gene is expressed primarily in Hodgkin's lymphoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, Hodgkin's lymphoma and other immune disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 285 as residues: Pro-27 to Thr-32.

The tissue distribution in Hodgkin's lymphoma and the sequence homology
indicates that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis of Hodgkin's lymphoma, since the elevated expression and secretion by the
tumor mass may be indicative of tumors of this type. Additionally, the gene product may
be used as a target in the immunotherapy of the cancer. Because the gene is expressed in
cells of lymphoid origin, the natural gene product may be involved in immune
functions. Therefore it may also be used as an agent for immunological disorders
including arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 53 

This gene has extensive homology to cDNA for Homo sapiens mRNA for the
ISLR gene (See Accession No. AB003184). This protein is considered to be a new
member of the Ig superfamily and contains a leucine-rich repeat (LRR) with conserved
flanking sequences and a C2-type immunoglobulin (Ig)-like domain. These domains are
important for protein-protein interaction or cell adhesion, and therefore it is possible that
the novel protein ISLR may also interact with other proteins or cells. The ISLR gene
was mapped on human chromosome 15q23-q24 by fluorescence in situ hybridization
(See Medline Article No. 97468140). Homology to the ISLR gene has been confirmed
by another independent group as well (See Accession No. Hs.102171).

This gene is expressed in a number of tissues including human retina, heart,
skeletal muscle, prostate, ovary, small intestine, thyroid, adrenal cortex, testis,
stomach, spinal cord, fetal lung and fetal kidney tissues, colon, tonsil and stomach
cancer, and to a lesser extent in endometrial stromal cells treated with estradiol, breast
tissue, synovium, lymphoma, and a number of other tumors.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors of colon, ovary and breast origins. However, due to the wide
range of expression in various tissues, the protein may play a vital role in the development
of cancer in other tissues as well, not just those mentioned above. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the colon, ovary and
breast, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Additionally, this gene maps to chromosome
15q23-q24, and therefore, can be used as a marker in linkage analysis for chromosome 15.

The tissue distribution in tumors of colon, ovary, and breast origins indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. The protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy target for the above
listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 54 

This gene has homology to multidrug resistance gene 1 (See Accession No.
P06795). Preferred polynucleotide fragments comprise the following sequence:
GCTTCGTGTCCAACCCrCTTGCCCTTCGCCTGTGTGCCTGGAGCCACT 
CCACGCrCGCGTITCCTCCTGTAGTGCrCACAGGTCCCAGCACCGATGGCA 
TTCCCTirGCXTCTGAGTCTGCAGCGGGTCCCrriTCT
GGTAGCCTCTCTCCCCCTGGGCCACTCCCGGGGGTGAGGGGGTrACCCC^
CCCAGTGTITTITATTCCTGTGGGGCTCACCCCAAAGTATrAAAA 
GTAA (SEQ ID NO:503). Also preferred are polypeptide fragments encoded by these 
polynucleotide fragments. 

This gene is expressed primarily in lung, esophagus, leukemia (Jurkat cells) and
breast cancers and to a lesser extent in macrophages treated with GM-CSF, fetal tissues,
and a wide range of tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancers of a wide range of origins. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the solid tumors, lung and leukemia,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Furthermore, due to the high expression level in lung tissue and the proposed
function of the multidrug resistance protein 1 gene as the efflux pump responsible for
low drug accumulation in multidrug-resistant cells, the protein, as well as mutants thereof,
may also be beneficial as a target for gene therapy, particularly for the chronic patient.
Preferred epitopes include those comprising a sequence shown in SEQ ID NO: 287 as
residues: Met-1 to Lys-16.

The tissue distribution in a wide range of cancers and fetal tissues indicates that
polynucleotides and polypeptides corresponding to this gene are useful for detection of
cells in active proliferation, such as cancers. The gene products may be used as cancer
markers or immunotherapy targets.

FEATURES OF PROTEIN ENCODED BY GENE NO: 55 
This gene maps to the X chromosome. 

This gene is expressed primarily in the brain and to a lesser extent in the
developing embryo.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative disease states and developmental disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders, including sex-linked disorders, of the above tissues or cells, 
particularly of the neurological, developmental systems, and cardiovascular system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Moreover, this gene maps to the X chromosome, and therefore, may be used
as a marker in linkage analysis for this chromosome. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, Klinefelter's, schizophrenia, mania, dementia,
paranoia, obsessive compulsive disorder and panic disorder. In addition, the gene or
gene product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sexually-linked disorders, or
disorders of the cardiovascular system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 56 
The translation product of this gene shares sequence homology with paxillin
which is thought to be important in mediating signal transduction from growth factor
receptors to the cytoskeleton. Preferred polynucleotide fragments comprise the
following sequence: TGKKrrCACrGTCTTACAATCACTGCTGTGGAATCATGA
TACCACmTAGCTCTrrGCATCTrCCTTCAGTGTATTI^
AAGTAGATTITAACrGGACAACmTGAGTACTGACATCATTGATAAATAAACr
GGCTTGTGGTTTCAA (SEQ ID NO:506). Also preferred are polypeptide fragments 
encoded by these polynucleotide fragments. More preferably, polypeptide fragments 
comprise the amino acid sequence: LDELMAHLTEMQAKVAVRAD 
AGKKHLPDKQDHKASUJSMIXjGI^QELQDLGIATVPKGHCASCQK^
HALGQSWHPEHFVCTHCKEEIGSSPFFERSGLXYCPNDYHQLFSPRCAYCAAP
IU)KVLTAMNQTWHPEHFFCSHCGEWGAEGFHEKDKKPYCRKDFL^^ 
CGGCNRPVI£NY1^ANIDTVWHPECFVCGDCFTSFS^^ 
HRRGTLCHGCGQPITGRaSAMGYKFHPEHFVCAFCI.TQLSKGIFREQ^ 
CQPCFNKLF (SEQ ID NO:507); KASUaSMLGGLEQELQDLGIATVPKGHC
ASCQKPIAGKVIHAL (SEQ ID NO:508); CPNDYHQLFSPRCAYCAAPILDKVL
TAMNQTWHPEHFFCSHCGEVFGAEG (SEQ ID NO:509); DKKPYCRKDFLAM 
FSPKCGGCNRP\a.ENYI^AMDTWHPECFVCGDCFrSFSTGSFFE^ 
L (SEQ ID NO:510); CGQPITGRCISAMGYKFHPEHFVCAFCLTQLSKGIFRE 
QNDKTYCQ (SEQ ID NO:511). Polynucleotide fragments encoding these preferred
polypeptide fragments are also contemplated.

This gene is expressed primarily in brain, and to a lesser extent in the 
developing embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disease states and developmental abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune and
nervous systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Moreover, since this gene shares homology with a
gene that maps to chromosome 11 (See Accession No. T87404), the gene as well as its
translated product may be used for linkage analysis on chromosome 11.

The tissue distribution and homology to paxillin indicates that polynucleotides 
and polypeptides corresponding to this gene are useful for the treatment and/or detection
of disease states associated with abnormal signal transduction in brain and/or the
developing embryo. This would include treatment or detection of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder and also in the treatment and/or detection of
embryonic development defects. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 57

This gene is expressed primarily in fetal spleen, brain, and to a lesser extent in 
six week old embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders, neurological disorders, and developmental
abnormalities. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and developmental systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 290 as residues: Arg-28 to Gly-34.

The expression of this gene in fetal spleen indicates that polynucleotides and
polypeptides corresponding to this gene are useful for treatment/detection of immune
disorders such as arthritis, asthma, immune deficiency diseases such as AIDS, and
leukemia. In addition, the expression of this gene in the early embryo indicates a key
role in embryo development, and hence the gene or gene product could be used in the
treatment and/or detection of embryonic development defects. This would include
treatment or detection of neurodegenerative disease states and behavioral disorders such
as Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia,
mania, dementia, paranoia, obsessive compulsive disorder and panic disorder and also
in the treatment and/or detection of embryonic development defects.


FEATURES OF PROTEIN ENCODED BY GENE NO: 58 
The translation product of this gene shares sequence homology with the gene disrupted
in the neurodegenerative disease dentatorubral-pallidoluysian atrophy. Moreover, a long
open reading frame exists in an alternative frame. Preferred polypeptide fragments
comprise the following:

mgssqsveipgggtegyhvlrvqenspghraglepffdhvsingsrlnkdnd 
tlkdjxkxnvekpvkmliyssktlelretsvtpsnlwggqgllgvsirfcsfd 
ganenvwhvlevesnspaalaglrphsdvngadtvmnesedlfslierheakp 
lklyvyntdtdncreviitpnsawggegsuk:gigygylhriptrpfeegkkis 
lpgqmagtprrplkdgftevqlssvnppslsppgrrgieqsltglsisstppavss

Vl^GVPTWIXPPQ\^QSLTS\a>PMNPATnJKjI^lPAGLPNLPNIJsrL^ 
PHIMPGVGLPELVNPGLPPLPSMPPRNLPGIAPLPLPSEFLPSFPLVPESSSAASS 
GELLSSIJ>PTSNAPSDPATTrAKADAASSLT\^VrPPTAKAPTTVEDRVGDSTPV 
SEKPVSAAVDANASESP (SEQ ID NO:512); SVEIPGGGTEGYHVLRVQENSPGH 

RAGI^FFDFIVSINGSRIJ^KDNDTLK33IXKXNVEKPVKMLIYSSKTLELRETS
VTPSNLWGGQGLLGVSIRFCSFDGANENVWH (SEQ ID NO:5 13); ESNSPAA 
LAGUO'HSDYnGADTVMNESEDIJSLJETHEAKPLia.YVYNTDTDNCREV^ 
NSAWGGEGSLGCGIGYGYLHRIPTRPFEEGKKISLPGQMAGTPITPLKDGFrEV 
QLSSVNPPSLSPPGTTGIEQSLTG LSISS (SEQ ID NO:514); RIPTRPFEEGKKI 

SLPGQMAGTPrrPLKDGFTEVQLSSVNPPSLSPPGTTGIEQSLTGLSISSTPPAVS
SVLSTGVPTWLIPPQVNQSLTSWPMNPATTLPGLMPLPAGU>NLPNLNLNLP 
APHIMPGVGLPELVNPGLPPLPSMPPRN (SEQ ID NO:516); PGLPPLPSMPPRN 
LPGIAPLPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPATTTAKADAA 
SSLTVDVTPPTAKAPTTVEDRVGDSTPVSEKPVSAAVDAN (SEQ ID NO:517). 

This gene is expressed primarily in prostate cancer, and to a lesser extent in the
pineal glands and in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological conditions and pulmonary disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the nervous,
pulmonary, and endocrine systems, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 291 as residues: Asn-9 to Leu-14.
The abundance of this gene in the pineal gland and its homology to a gene
disrupted in the neurodegenerative disease state Dentatorubral-pallidoluysian atrophy
indicate that this gene may be useful in the treatment and/or detection of other
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. The abundance of this gene in fetal
lung would suggest that misregulation of the expression of this protein product in the
adult could lead to lymphoma or sarcoma formation, particularly in the lung; that it may
also be involved in predisposition to certain pulmonary defects such as pulmonary
edema and embolism, bronchitis and cystic fibrosis; and thus the gene or the
protein encoded by the gene could be used in the detection and/or treatment of these
pulmonary disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 59 
This gene is expressed primarily in the developing embryo. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the developmental system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
The expression of this gene primarily in the embryo indicates the gene plays a
key role in embryo development and that the gene or the protein encoded by the gene
could be used in the treatment and/or detection of developmental defects in the embryo
or in infants.


FEATURES OF PROTEIN ENCODED BY GENE NO: 60 

This gene displays homology to nestin, an intermediate filament protein, the 
expression of which correlates with the proliferation of Central Nervous System 
progenitor cells and that is useful in the identification of brain tumors. This gene maps
to chromosome 1, and therefore, may be used as a marker in linkage analysis for
chromosome 1 (See Accession No. AA527348). 

This gene is expressed primarily in kidney and to a lesser extent in brain. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, renal disorders and neurodegenerative conditions. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the excretory and
nervous systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 293 as residues: Thr-128 to Asn-135. 

The tissue distribution and homology to nestin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for detection and/or treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, its abundance in kidney 
indicates that it is useful in the treatment and detection of acute renal failure and other 
disease states associated with the kidney. 






FEATURES OF PROTEIN ENCODED BY GENE NO: 61 

Gene shares homology with the latrophilin-related protein 1 precursor as well as 
the calcium-independent alpha-latrotoxin receptor. Preferred polypeptide fragments
comprise the following amino acid sequence:

lYKVFRHTAGLKPEVSCFENmSCARXXXXXXXXXXXXWIFGVLHV^ 
TAYLFTVSNAFQGMFIFLFLCVl^mQEEYYRLFK^ (SEQ ID NO:518); 
WG\a.HWHASVWAYUTVSNAFQGMFIFIJF^ 
C (SEQ ID NO:519). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. (See Accession No. 2213659) The translation product of this
gene shares sequence homology with CD97, a seven transmembrane bound receptor.
This gene is expressed primarily in infant brain and in endothelial cells. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders and hematopoietic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the neurological and
hematopoietic systems, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 294 as residues: Lys-13 to Leu-21.

The tissue distribution of this gene suggests that it may be useful in the detection
and/or treatment of neurodegenerative disease states and behavioral disorders such as
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia, mania,
dementia, paranoia, obsessive compulsive disorder and panic disorder, while its
expression in hematopoietic cell types indicates that the gene could be important for the
treatment or detection of immune or hematopoietic disorders including arthritis, asthma
and immunodeficiency diseases.


FEATURES OF PROTEIN ENCODED BY GENE NO: 62 
This gene is expressed primarily in fetal liver and fetal spleen. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematological and immunological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune and hematopoietic systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 295 as residues: Ser-91 to Lys-98.
The tissue distribution of this gene in fetal liver and spleen indicates that the gene
could be important for the treatment or detection of immune or hematopoietic disorders
including arthritis, leukemia, asthma and immunodeficiency diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 63 
Gene shares homology with human serum amyloid protein. Preferred polypeptide
fragments comprise the following amino acid sequence:

ALTRIPPGDWVINWAVSFAGKTTARFFHSSPPSLGDQARTDPGHQRRD (SEQ 
ID NO:520) (See Accession No. W13671). Also preferred are polynucleotide
fragments encoding these polypeptide fragments. This gene maps to chromosome 9, and
therefore, may be used as a marker in linkage analysis for chromosome 9 (See
Accession No. AA004342).

This gene is expressed primarily in fetal liver and spleen. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematopoietic and immune disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the hematopoietic and immune systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
The tissue distribution of this gene in fetal liver-spleen indicates that the gene
could be important for the treatment or detection of immune or hematopoietic disorders
including arthritis, leukemia, asthma, and immunodeficiency diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 64

This gene maps to chromosome 3, and therefore, may be used as a marker in 
linkage analysis for chromosome 3 (See Accession No. AA219669).
This gene is expressed specifically in the brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative disease states. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the neurological systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 65 

Gene shares homology with a yeast protein. Preferred polypeptide fragments 
comprise the following amino acid sequence: LQEVNTTLPENSVWYERYKFDIP
VFHL (SEQ ID NO:521). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. (See Accession No. 1332638) 

This gene is expressed primarily in fetal tissue (fetus and fetal liver). 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver disorders and cancers (e.g. hepatoblastoma). Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the hepatic system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 298 as residues: Asn-59 to Glu-64.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection and treatment of liver disorders
and cancers (e.g. hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and
conditions that are attributable to the differentiation of hepatocyte progenitor cells). In
addition, the expression in fetus would suggest a useful role for the protein product in
developmental abnormalities, fetal deficiencies, pre-natal disorders and various
wound-healing models and/or tissue trauma.



FEATURES OF PROTEIN ENCODED BY GENE NO: 66 

Gene has homology with a B-cell surface antigen, which may indicate that the gene plays
a role in the immune response, including, but not limited to, disorders and infections of
the immune system. Preferred polynucleotide fragments comprise the following
sequence: TAGCATGTAGCCAGTQjAATAACmATAAGGACAAAGTGGAGTC 
CACGCGTGCGGCCGTCTAGACrAGTGGATCCCCCGGCTGCAGGATTCGGC 

ACGAG (SEQ ID NO:523). Also preferred are polypeptide fragments encoded by
these polynucleotide fragments (See Accession No. T94535). Additionally, this gene
shares homology with an interferon-gamma receptor. Preferred polypeptide fragments
also comprise the following amino acid sequence: M(y}SGSQFRAaXCLCFSCPC
SPGGPRWNSR(XSGRRFPKTCRAISQNLWKYKTFCPVRYMQPHRSSI^ 

YVI^TWGSLRTYSTDLKKKiaa^SRGGPW (SEQ ID NO:522);

MQGSGSQFRAClLCLCFSa>CSPGGPRWNSR(y}GRRFPKTC^ 
(SEQ ID NO:524); PVRYMQPHRSSUXHFTSYVFILSTWGSLRTYSTDLKKK^ 
NSRGGPVPIRPKS (SEQ ID NO:525); and GEEQRDCSLGWRGVGMRATHCQAA 
RMFVLFSLPKYAGL (SEQ ID NO:526). Also preferred are polynucleotide fragments
encoding these polypeptide fragments.

This gene is expressed primarily in T-cells and gall bladder.
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological disorders and conditions (immunodeficiencies, cancer,
leukemia, hematopoiesis). Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune and digestive systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 299 as residues:
Thr-41 to Gly-52.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of immune 
disorders including: leukemias, lymphomas, auto-immune disorders, immuno-suppressive
conditions (transplantation) and immunodeficiencies (e.g. AIDS), inflammation and
hematopoietic disorders. The expression of this gene in gall bladder would suggest a
possible role for this gene product in digestive disorders, particularly of the pancreas. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 67 

This gene maps to chromosome 11, and therefore, may be used as a marker in
linkage analysis for chromosome 11 (See Accession No. AA011622).

This gene is expressed primarily in a variety of fetal and developmental tissues 
(e.g. fetal spleen, infant brain). 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental, immune or neurological abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the developing
immune and central nervous systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 300 as residues: Ser-38 to Ser-43.
The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for developmental abnormalities or fetal
deficiencies. The detection in infant brain would suggest a role in neurological disorders
(both developmental and neurodegenerative conditions of the brain and nervous system,
behavioral disorders, depression, schizophrenia, Alzheimer's disease, Parkinson's
disease, Huntington's disease, mania, dementia). In addition, the detection in spleen
would similarly suggest a role in detection and treatment of immunologically mediated
disorders (e.g. immunodeficiency, inflammation, cancer, wound healing, tissue repair,
hematopoiesis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 68

This gene is expressed primarily in spleen, T-cells, and fetal heart. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological deficiencies, including AIDS, and cardiovascular
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and cardiovascular systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders including: leukemias, lymphomas, autoimmune disorders,
immunodeficiencies (e.g. AIDS), immuno-suppressive conditions (transplantation) and
hematopoietic disorders. The expression in fetal heart indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the treatment and diagnosis of
cardiovascular disorders (e.g. heart disease, restenosis, atherosclerosis, stroke, angina,
thrombosis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 69

Gene shares homology with a human collagen protein. Preferred polypeptide 
fragments comprise the following amino acid sequence: 
MPRKTSKCRQLLCSGASRNADTAARQSTCSSHRPPGKIPSLGPRRXPGCXSVP
SSRGEQSTGSPAAPRCGRRDAPIRGIJ>GGAAMTPGDWASFNPRAGHSKS(^ 
GQESSGASRQDRHPVSHWVERQREAWGAPRSSSAGGVKVAATTEREPEFKIK 
TGKA (SEQ ID NO:527); CSGASRNADTAARQSTCSSHRPPGKIPSLGPRRXPG 
CXSVPSSRGEQSTGSPAAPRCGRRDAHRGLPGGAAMTPGDTWASFNPRAGHS 
(SEQ ID NO:528); QGEGQESSGASRQDRHPVSHWVERQREAWGAPRSSSAGG
VKVAATTEREPEFKIKTGKA (SEQ ID NO:529) (See Accession No. 124886). Also 
preferred are polynucleotide fragments encoding these polypeptide fragments.
This gene is expressed primarily in fetal heart. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cardiovascular disorders. Similarly, polypeptides and antibodies directed 
to these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the cardiovascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 302 as residues: 
Pro-32 to Ser-39. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of cardiovascular
disorders (e.g. heart disease, restenosis, atherosclerosis, stroke, angina, thrombosis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 70 

The translation product of this gene shares sequence homology with a chicken 
single-strand DNA-binding protein. Preferred polypeptide fragments comprise the
following amino acid sequence:

MSPRYPGGPRPPLRIPNQALXjGVPGSQPIXPSGMDPTRQQGHPNMGGPMQ 
TPPRGMWLGPQNYGGAMRPPLNALGGPGMPGMNMGPGGGRPWPNPTNAN
SIPYSSASPGNYVGPPGGGGPPGTPIMPSPADSTNSGDNMYTLNINAVPPGPNR
PNFPMGPGSDGPMGGLGGMESHHMNGSLGSGDNIPSISKNSPNNMSLSNQP 
GTPRDDGEMGGNFLNPFQSESYSPSMTMSV (SEQ ID NO:530); MSPRYPGG
PRPPLRJPNQAUjGVPGSQPIXPSGMDPTRQQGHPNMGGPMQRNTTPPRGMVP 
LGPQNYGGAMRPPUsfALGGPGMPGMNMGPGGGRPWPNPTNANSIPYSSASP
GNY (SEQ ID NO:531); LNALGGPGMPGMNMGPGGGRPWPNPTNANSIPYSS
ASPGNYVGPPGGGGPPGTPIMPSPADSTNSGDNMYTLMNAVPPGPN (SEQ ID 
NO:532); GPMGGLGGMESHHMNGSLGSGDMDSISKNSPNNMSLSNQPGTPR 
DDGEMGGNFLNPFQSESYSPSMTMSV (SEQ ID NO:533); TCEHSSEAKAFHDY
(SEQ ID NO:534). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. (See Accession No. 1562534) 

This gene is expressed primarily in placenta and to a lesser extent in the fetal 
heart and a variety of other tissues and cell types. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities, fetal deficiencies, and particularly of the
cardiovascular system. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the reproductive system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection and treatment of developmental 
abnormalities or fetal deficiencies, ovarian and other endometrial cancers, reproductive
dysfunction, cardiovascular disorders, and pre-natal disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 71 

This gene is expressed primarily in fetal liver and to a lesser extent in the breast 
and testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver disorders (including hepatoblastomas) and reproductive disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
hepatic and reproductive systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for detection and treatment of liver disorders and 
cancers (e.g. hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and 
conditions that are attributable to the differentiation of hepatocyte progenitor cells). The
expression in testes and breast indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of endocrine and
reproductive disorders (e.g. sperm maturation, milk production, testicular and breast 
cancers). 

FEATURES OF PROTEIN ENCODED BY GENE NO: 72

This gene maps to chromosome 1, and therefore, may be used as a marker in 
linkage analysis for chromosome 1 (See Accession No. W93595). 

This gene is expressed primarily in smooth muscle and to a lesser extent in
brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cardiovascular and neurological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the cardiovascular and central nervous
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.
The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of restenosis,
atherosclerosis, stroke, angina, thrombosis, wound healing and other conditions of
heart disease. In addition, the expression in brain would suggest that polynucleotides
and polypeptides corresponding to this gene are useful for the detection and treatment of
developmental, degenerative and behavioral conditions of the brain and nervous system
(e.g. schizophrenia, depression, Alzheimer's disease, Parkinson's disease,
Huntington's disease, mania, dementia, paranoia, addictive behavior and sleep
disorders).


FEATURES OF PROTEIN ENCODED BY GENE NO: 73 

Gene shares homology with human stromalin-2. Preferred polypeptide 
fragments comprise the following amino acid sequence: 
QAFVIXSDIXLIFSPQMIVGGRDFLJIPLVFFPEATLQSELASFL^ 

GSGA (SEQ ID NO:535); ACSYIiCNPEFTFFSRADFARSQLVDLLTDRFQQE
LEELLQVG (SEQ ID NO:536); QKQLSSLRDRMVAFCELCQSCLSDVDTEIQEQV
ST (SEQ ID NO:537); QVILPALTLVYFSDLWTLTHISKSDAS (SEQ ID NO:538);
STHDLTRWELYEPCCQLLQKAVDTGXVPHQV (SEQ ID NO:539). Also preferred
are polynucleotide fragments encoding these polypeptide fragments (See Accession
No. R65208). This gene maps to chromosome 7, and therefore, may be used as a
marker in linkage analysis for chromosome 7 (See Accession No. D52585).

This gene is expressed primarily in the brain (infant brain, adult brain, pituitary, 
cerebellum, hippocampus, schizophrenic hypothalamus, amygdala).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental and neurodegenerative diseases of the brain and nervous
system. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
central nervous system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 306 as residues: Thr-25 to Lys-36, Lys-55
to Ser-63.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detection and treatment of developmental,
degenerative and behavioral conditions of the brain and nervous system (e.g.,
schizophrenia, depression, Alzheimer's disease, Parkinson's disease, Huntington's
disease, mania, dementia, paranoia, addictive behavior and sleep disorders).

FEATURES OF PROTEIN ENCODED BY GENE NO: 74 
This gene is expressed primarily in the hypothalamus of a human suffering from
schizophrenia.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the CNS, particularly schizophrenia. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the CNS, such as schizophrenia,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 307 as residues: Gly-38 to Ala-44.

The tissue distribution indicates that the protein products of this gene are useful 
for the study, diagnosis and treatment of schizophrenia and other disorders involving 
the CNS. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 75

Preferred polypeptides of the invention comprise the following amino acid 
sequence encoded by this gene: 

LAVSTSHCCADISTALPLGSSRPAPAPRHREHEHGHQARPPRLLXTSLMPLSTP 
AAAQLLWTQLTPMGGRPGGRHSPPTLHTGPRALPPGPPHPSLHVAAI^LLR 
(SEQ ID NO:540). Polynucleotides encoding such polypeptides are also provided.

This gene is expressed primarily in endometrial tumor and to a lesser extent in 
amniotic cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive and immune disorders particularly cancers of those
systems. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the reproductive and immune systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 308 as residues: Ser-3 to Arg-9.

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of immune and reproductive disorders particularly cancers of
those systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 76 

This gene is expressed primarily in kidney cortex and to a lesser extent in early
stage human brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, renal disorders such as renal cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the kidney, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 309 as residues:
Gly-38 to Gly-45, Gly-47 to Gly-52, Pro-92 to Lys-110.

The tissue distribution indicates that the protein products of this gene are useful 
for study, treatment and diagnosis of renal diseases such as cancer of the kidney.

FEATURES OF PROTEIN ENCODED BY GENE NO: 77
This gene is expressed primarily in kidney medulla.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, metabolic and renal disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the metabolic and renal systems, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for study, treatment and diagnosis of metabolic and renal diseases and disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 78

This gene is expressed in chronic synovitis and microvascular endothelium. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, arthritis and atherosclerosis. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the vascular and skeletal systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 
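
By way of non-limiting illustration, the comparison of a measured expression level against the standard gene expression level described above can be sketched as follows. The function name `classify_expression`, the fold-change threshold of 2.0, and the use of a simple ratio in place of a statistical significance test are all assumptions made for this sketch; none of them is specified in this document.

```python
def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level to the standard (healthy) level.

    Returns "higher", "lower", or "unchanged". The fold threshold is a
    hypothetical stand-in for whatever significance criterion is used.
    """
    if standard_level <= 0:
        raise ValueError("standard_level must be positive")
    if sample_level < 0:
        raise ValueError("sample_level must be non-negative")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    # Ratio of the sample level to the standard level.
    ratio = sample_level / standard_level

    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1.0 / fold_threshold:
        return "lower"
    return "unchanged"


if __name__ == "__main__":
    # Illustrative values only; these are not measurements from this document.
    print(classify_expression(8.0, 2.0))
    print(classify_expression(0.5, 2.0))
    print(classify_expression(2.5, 2.0))
```

In practice, replicate measurements and a statistical test would likely replace the single ratio used here.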

The tissue distribution indicates that the protein products of this gene are useful
for study, diagnosis and treatment of arthritic and other inflammatory diseases as well
as cardiovascular diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 79

This gene is expressed in resting T-cells and activated monocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for the study and treatment of immune diseases such as inflammatory conditions. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 80 

This gene is expressed in a variety of immune system tissues, e.g., neutrophils,
T-cells, and TNF induced epithelial and endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious and immune disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune and vascular systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 313 as residues: Met-1 to Trp-6.

The tissue distribution indicates that the protein products of this gene are useful 
for study and treatment of infectious diseases, immune and vascular disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 81

This gene is expressed in activated neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and other immune conditions. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for study and treatment of immune disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 82 

This gene is expressed in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory and other immune conditions. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 315 as residues:
Ala-83 to Thr-91.

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of immune disorders.



FEATURES OF PROTEIN ENCODED BY GENE NO: 83 
This gene is expressed in human neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune and inflammatory system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of disorders of the inflammatory and immune systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 84

This gene is expressed in human neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the inflammatory and immune systems. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the inflammatory and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful
for diagnosis and treatment of disorders of the immune and inflammatory systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 85

This gene is expressed in activated neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune system diseases. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system and inflammatory
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of diseases of the inflammatory and immune systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 86

This gene is expressed in activated neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune system disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the inflammatory and immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 319 as residues: Met-1 to Gly-6, Gly-32 to Pro-43, Leu-55 to Ghi-60.

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of disorders of the immune and inflammatory system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 87

In specific embodiments, polypeptides of the invention comprise the sequence: 
EQVIJ\LLWPRFELILEMNVQSVRSTDPQRIXK3
QTIPNERTMQUjGQI^VEVENFVIJIVAAEFSSRK^
RAADDSKEVESFQQIiNARTQEFIEEU^PPFGGLVAFVKEAEAUERGQAER^
GEEARVTQLIRGFGSSWKSSVESLSQDVMRSFTNFRNGTSnQG (SEQ ID
NO:541); AIXKYRFFYQFLLGNERATAKEmDEYVETLSKIYl^ YYRS
VQYEEVAEKDDLMGVEDTAKKGFXSKPSRSRNTIFTLGTRGS\aSPTEL^
PHTAQR (SEQ ID NO:542); EQRYPFEAUTlSQHYXLLDNSCREYLnCEFFVVS
GPXAHDLFHA\^GRTLSMTLKHIJDSYLAIX7m
VPALDRYW (SEQ ID NO:543); GGLDTRPHYITRRYAEFSSALVSINQ (SEQ ID
NO:544); SRKEQLVFLINNYDMMLGVL (SEQ ID NO:545); and/or ALLKYRFFY
QFLLGNERATAKEIRDEYVETL^KIYIJSYYRSYI^
X^DTAKKGFXSKPSUISRNTIFTLGTRGSVISI^LEAPILVPHTAQRXEQR^^
EALJTISQHYXLLDNSCREYLHCEFFWSGPXAHDLJ^VM
SYlj\DCYDAIAVFLCIHIVlJlFRf^^
NVQSVRSTDPQRIXjGLiyna'HYrrRR^
EVENFNOJIVAAEFSSRKEQLVFLINNYDMMI^
ARTQEFIEELLSPPFGGLVAFVKEAEAIJERGQAERIJRGEEARVTQU^
KSSVESl^QDVMRSFTNFRNGTS (SEQ ID NO:546). Polynucleotides encoding

these polypeptides are also encompassed by the invention. The translation product of
this gene shares sequence homology with suppressor of actin mutation which is thought
to be important in mutation suppression.

This gene is expressed primarily in fetal liver and to a lesser extent in a variety
of other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver and mutations. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the liver or cancer, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 320 as residues:
Val-53 to Arg-60, Thr-88 to Thr-94, Ala-142 to Ser-150, Gly-188 to Glu-196, Gly-208
to Ser-214, Thr-227 to Gly-232, Lys-279 to Phe-285.
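
By way of non-limiting illustration, residue ranges written in the form used above (e.g., "Val-53 to Arg-60") can be converted to numeric intervals for further processing. The helper names `parse_epitope_ranges` and `epitope_length` are hypothetical and are introduced only for this sketch; the pattern assumes three-letter residue codes and tolerates stray whitespace after the hyphen.

```python
import re

# Three-letter residue code, hyphen, position, "to", and the same again.
_RANGE = re.compile(
    r"([A-Z][a-z]{2})-\s*(\d+)\s+to\s+([A-Z][a-z]{2})-\s*(\d+)"
)


def parse_epitope_ranges(text):
    """Return a list of (start_residue, start, end_residue, end) tuples."""
    ranges = []
    for start_res, start_pos, end_res, end_pos in _RANGE.findall(text):
        start, end = int(start_pos), int(end_pos)
        if start > end:
            raise ValueError(f"range runs backwards: {start} to {end}")
        ranges.append((start_res, start, end_res, end))
    return ranges


def epitope_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


if __name__ == "__main__":
    listed = "Val-53 to Arg-60, Thr-88 to Thr-94, Ala-142 to Ser-150"
    for start_res, start, end_res, end in parse_epitope_ranges(listed):
        print(start_res, start, end_res, end, epitope_length(start, end))
```

The ranges are treated as inclusive at both ends, so "Val-53 to Arg-60" spans eight residues under this reading.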
The tissue distribution and homology to suppressor of actin mutation suggest
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of liver disorders or cancer.



FEATURES OF PROTEIN ENCODED BY GENE NO: 88 

This gene maps to chromosome 9, and therefore can be used in linkage analysis
as a marker for chromosome 9. In specific embodiments, polypeptides of the invention
comprise the sequence:

YEGKEFDYVFSmwnEGGPSYKLPYNTSDDPWLTAYNFLQK^
KFITONTKGQMIXjLGNPSFSDPFTGGGRYVPGSSGSSNTLP^
PGSASMGTTMAGVDPFTGNSAYRSAASKTMNIYFPK^
LKEU^GTAPEEKKLTEDDUU^KII^UCNSSSEKFIYC^
FPALDILRl^IKHPSVNENFCNEKEGAQFSSHUNI^
NCFVGQAGQKL1S«MSQRESLMSHAIELKSGSNKNI (SEQ ID NO:547);
HIALATLAIJsnrSVCTHKD (SEQ ID NO:548); HNEGKAC^CIJSLISTILEVVQ
DLEATFRLLVALGTUSDDSNAVQLAKS (SEQ ID NO:549); LGVDSQIKKYSS
VSEPAKVSECCRFILNLL (SEQ ID NO:550); and/or YEGKEFDYVFSIDVNEGGPS
YKLPYNTSDDPWLTAYNFUJKNDLNPMFLDQVAKFm
DPFTGGGRYVPGSSGSSNTLPTADPFTGAGRYVPGSASMGTriV^
SAYRSAASKTMNIYFPKKEAVTFIX^ANPTQ
LLEKILSUCNSSSEKPTVQQLQILWKAINCPEDIVFPAL^
NEKEGAQFSSHIJNUJ«>KGKPANQLLALRTFCNC^
MSHAIELKSGSNKNIHIALATLAlJ^r^SVa^K^
I^ATFRLLVALGTTUSDDSNAVQIJVKSIjGVDSQIKKYSSVSEP^^
LL (SEQ ID NO:551). Polynucleotides encoding these polypeptides are also
encompassed by the invention. These polypeptides share significant homology with
phospholipase A2 activating protein which is thought to be important in signal
transduction (see, e.g., Wang et al., Gene 161(2):237-241 (1995)).

This gene is expressed primarily in endothelial cells and, to a lesser extent, in placenta,
endometrial stromal cells, osteosarcoma, testis tumor, muscle, and infant brain, which are
likely to be rich in blood vessels.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders in vascular system, aberrant angiogenesis, tumor angiogenesis.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
vascular system or tumors, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution of this gene in endothelial cells and several potential
highly vascularized tissues and its homology to phospholipase A2 activating protein
suggest that this gene may be involved in transducing signals for endothelial cells in
angiogenesis or vasculogenesis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 89 

In specific embodiments, polypeptides of the invention comprise the sequence: 
YPNQDGDIUUXJVIJffiHIQRI^KVVTANHRAI^
AYKTPRDKVQCIIJIM(3TIMN1J^LA>^
STVQYISSFYASCLSGEESYWWMQFTAAVE (SEQ ID NO:552); YPNQDGDILR
DQVLHEHIQW^KWrANHRALQIPEVYIJlEAP\mA(JSEIRTO
CIUIMCSTIMNIXSLANEDSVPGADDFW
SCLSGEESYWWMQFTAAVEFIKTI (SEQ ID NO:553); YPNQIXjDILRDQVL (SEQ
ID NO:554); EAPAVPSACJSEI (SEQ ID NO:555); PVLVFVUKANP (SEQ ID
NO:560); SGEESYWWMQFTAAVEFIKTI (SEQ ID NO:556); ADDFVPVLVF
VUKANPP (SEQ ID NO:557); YKTPRDKV(2CIL (SEQ ID NO:558); and/or
GADDFVPVLVFVUK (SEQ ID NO:559). The translation product of this gene shares
sequence homology with human ras inhibitor and yeast VPS9p which is thought to be
important in Golgi vacuole transport.

This gene is expressed primarily in T cells and melanocytes and to a lesser 
extent in a variety of other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, dysfunction and disorders involving T cells and melanocytes. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to ras inhibitor indicate that
polynucleotides and polypeptides corresponding to this gene are useful for regulating
signal transduction and for diagnosis and treatment of disorders involving T cells and
melanocytes.

FEATURES OF PROTEIN ENCODED BY GENE NO: 90 

This gene maps to chromosome 9 and therefore polypeptides of the invention
can be used in linkage analysis as a marker for chromosome 9. The translation product
of this gene shares sequence homology with neuronal olfactomedin-related ER localized
protein which is thought to be important in influencing the maintenance, growth, or
differentiation of chemosensory cilia on the apical dendrites of olfactory neurons. In
specific embodiments, polypeptides of the invention comprise the sequence:
SARASTQPPAGQHPGPC (SEQ ID NO:561); MPGRWRWQRDMHPARKLLSLL
FLILMGTELTQD (SEQ ID NO:562); SAAPDSLLRSSKGSTRGSL (SEQ ID
NO:563); AAIVIWRGKSESRIAKTPGI (SEQ ID NO:564); FRGGGTLVLPPTHT
PEWUL (SEQ ID NO:567); PLGITLPLGAPETGGGD (SEQ ID NO:565); and/or
CAAETWKGSQRAGQLCALLA (SEQ ID NO:566).

This gene is expressed in pineal gland.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological and endocrinological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the neurological or endocrine systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 323 as residues: Leu-20 to Ala-26, Arg-32 to Arg-39, Thr-104 to Gly-112.

The tissue distribution and homology to olfactomedin-related protein indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
maintenance, growth, or differentiation of neuronal cells in the pineal gland, and therefore
may be useful for diagnosis and treatment of neurological disorders of the pineal gland.

FEATURES OF PROTEIN ENCODED BY GENE NO: 91 

This gene is expressed primarily in prostate and apoptotic T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, prostate disease and T cell dysfunction. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly prostate cancer, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for detecting abnormal activity in prostate and
T cells and possibly for treating this abnormality.

FEATURES OF PROTEIN ENCODED BY GENE NO: 92 

This gene is expressed primarily in prostate and to a lesser extent in smooth 
muscle cells, fibroblasts, and placenta. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders in prostate or vascular system. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the prostate or vascular system, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for regulating function of prostate or highly 
vascularized tissues, e.g. placenta. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 93

This gene is expressed primarily in human embryonic and fetal stage tissues and
to a lesser extent in a wide variety of other proliferative tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders in embryonic development and cell proliferation. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the embryonic tissues
and proliferative cells, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis or treatment of abnormalities in 
developing and proliferative cells and organs. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 94

The translation product of this gene shares sequence homology with 
transformation related protein which is thought to be important in transformation. 

This gene is expressed primarily in female reproductive tissues, i.e., breast 
cancer cells, placenta, and ovary and to a lesser extent in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer or dysfunction of reproductive tissues. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the reproductive system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 327 as residues: Ser-50 to Pro-61.

The tissue distribution and homology to transformation related protein indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of conditions caused by transformation, i.e., tumorigenesis in
reproductive organs, e.g., breast, placenta, and ovary.

FEATURES OF PROTEIN ENCODED BY GENE NO: 95 

This gene is expressed primarily in testes, rhabdomyosarcoma, infant brain and 
to a lesser extent in some tumors and highly vascularized tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumorigenesis, abnormal angiogenesis, and/or neurological disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
tumor tissues or vascular tissues, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 328 as residues: Arg-46 to Trp-54,
Pro-60 to Ile-69, Asn-116 to Ala-122, Arg-147 to Lys-153, Ser-158 to Glu-170, Ile-399 to
Ser-405, Pro-486 to Met-499, Pro-502 to Asp-508.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for a range of disease states including treatment of
tumor or vascular disorders and the treatment of neurological disorders such as
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia, mania,
dementia, paranoia, obsessive compulsive disorder and panic disorder. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 96

This gene maps to chromosome 7 and therefore polynucleotides of the present 
invention can be used in linkage analysis as a marker for chromosome 7. The 
translation product of this gene is homologous to the Clostridium perfringens 
enterotoxin (CPE) receptor gene product and shares sequence homology with a human
ORF specific to prostate and a glycoprotein specific to oligodendrocytes, both of which
are tissue-specific proteins. (See, e.g., Katahira et al., J. Cell Biol. 136(6):1239-1247
(1997); PMID: 9087440; UI: 97242441.)

This gene is expressed primarily in pancreas tumor and ulcerative colitis and to a 
lesser extent in several tumors and normal tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, pancreatic disorder, ulcerative colitis, tumors and food poisoning.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
digestive system or tumorigenic system, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 329 as residues: Gly-147 to
Met-152, Cys-177 to Lys-188.

The tissue distribution and homology to prostate and oligodendrocyte-specific
protein indicate that polynucleotides and polypeptides corresponding to this gene are
useful as a marker for diagnosis or treatment of disorders of the pancreas, ulcerative
colitis, and tumors. Furthermore, identity to the human receptor for Clostridium
perfringens enterotoxin indicates that the soluble portion of this receptor could be used
in the treatment of food poisoning associated with Clostridium perfringens by blocking
the activity of perfringens enterotoxin.

FEATURES OF PROTEIN ENCODED BY GENE NO: 97

The translation product of this gene shares sequence homology with ATPase 
which is thought to be important in metabolism.

This gene is expressed primarily in testes and several hematopoietic cells and to
a lesser extent in other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, leukemia and hematopoietic disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the hematopoietic system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 330 as residues: Leu-37 to Ala-42.

The tissue distribution and homology to ATPase indicate that polynucleotides
and polypeptides corresponding to this gene are useful as a marker for diagnosis and
treatment of leukemia and other hematopoietic disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 98

In specific embodiments, polypeptides of the invention comprise the sequence: 
MRSARPSLGCLPSWAFSQALNI (SEQ ID NO:568); LLGLKGLAPAEISAVCE 
KGNFN (SEQ ID NO:569); VAHGLAWSYYIGYLRULPELQARIR (SEQ ID 
NO:570); TYNQHYNNLLRGAVSQRC (SEQ ID NO:571); HiPLDCGVPDNLSM 

ADPNIRFLDKLP(JC!TGDRAGIKDRVYSN (SEQ ID NO:572); SIYELLENGQRAGT
CVLEYATPLQTLFAMSQYSQAGFSGEDRLEQ (SEQ ID NO:573); AKLFCRTLE 
DILADAPESQNNCRUAYQEPADDSSFSLSQEVLRHLRQEEKEEVT^ 
PSTSTMSQEPELUSGMEKPLPLRTDFS (SEQ ID NO:574); and/or LLGLKGLA 
PAmS AVCEKGNFNVAHGIJVWSYYIGYLRLILPEL (SEQ ID NO:575). 

Polynucleotides encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in prostate BPH and to a lesser extent in bone 
marrow.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, benign prostatic hypertrophy or prostate cancer. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the male urinary system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 331 as residues: Ile-60 to Asn-69, Leu-106 to Asp-112, Glu-130 to Gly-136,
Phe-160 to Glu-167, Pro-184 to Cys-190, Glu-197 to Ser-202, Arg-215 to Glu-221,
Thr-237 to Pro-242.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis or treatment of benign prostatic 
hypertrophy or prostate cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 99 

This gene is expressed primarily in salivary gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders or injuries of the salivary gland. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of glandular tissues, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment of disorders of, or injuries to, the
salivary gland or other glandular tissue.

FEATURES OF PROTEIN ENCODED BY GENE NO: 100

This gene maps to chromosome 15; accordingly, polynucleotides of the
invention can be used in linkage analysis as a marker for chromosome 15. The
translation product of this gene shares sequence homology with a C. elegans gene of
unknown function. In specific embodiments, polypeptides of the invention comprise 
the sequence: DPRVRLNSLTCKHIHSLTQ (SEQ ID NO:583); TMKLLKLRRNIV 
KLSLYRHFTN (SEQ ID NO:576); TLBLAVAASIVFIIWTTMKFRI (SEQ ID
NO:577); \TCQSDWRELWVDDAIWRLLFSMILFVI (SEQ ID NO:578); MVLWR
PSANNQRFAFSPLSEEEEEDEQ (SEQ ID NO:580); KEPMLKESFEGMKMRS
TKQEPNGNSKVNKAQEDDL (SEQ ID NO:584); and/or KWVEENVPSSVTDVALP
ALLDSDEERMTTHFERSKME (SEQ ID NO:582). Polynucleotides encoding these 
polypeptides are also encompassed by the invention. 

This gene is expressed primarily in thyroid and to a lesser extent in
osteoclastoma, kidney medulla, and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, thyroid dysfunction or cancer. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the endocrine system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 333 as residues:
Lys-IQ? to Leu-124, Glu-150 to Thr-159, Pro-173 to Asp-179, Ser-192 to Ser-201.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of thyroid dysfunction
or cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 101

This gene maps to chromosome 16; therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 16. In specific
embodiments, polypeptides of the invention comprise the sequence:
IRHELTVLRDTRPACA (SEQ ID NO:585); and/or MDFXMALIYD (SEQ ID
NO:586). Polynucleotides encoding these polypeptides are also encompassed by the
invention.

This gene is expressed primarily in kidney cortex and to a lesser extent in adult 
brain, corpus callosum, hippocampus, and frontal cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment or diagnosis of neurological 
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 102 

In specific embodiments, polypeptides of the invention comprise the sequence:
MQENIMRNQDRALSNLESIPGGYNA (SEQ ID NO:587); LRRMYTDIQEPMLSA 

AQEQF GGNPF (SEQ ID NO:588); ASLVSNTSSGEGSQPSRTENRDPLPNPWAP
QT (SEQ ID NO:589); SQSSSASSGTASTVGGTTGSTASGTSGQSTTAPNLVPGV 
GASMFNTPG MCJSLUX^ITENPQIJ^QNMLSAPY (SEQ ID NO:590); 
MRSMMQSLSQl^DIAAQMMUsnSfPlPAGI^ (SEQ ID
NO:591); MQm^imSAMSNPKAMQALLJQK^
ALGSTGGSSGTNGSNATPSENTSPTAGT (SEQ ID NO:592); TEPGHCJQH
(X^MUJALAGVNPQUJNPEVRF(3(5QI£QL^AMGFL^^ 
lERLLGSQPS (SEQ ID NO:593); RNPAMMQEMMRNQDRALSNLESDKjGY 
NALRRMYTDIQEPMLSAA (SEQ ID NO:594); GNPFASLVSNTSS (SEQ ID 
NO:595); ENRDPLPNPWA (SEQ ID NO:595); GKILKDQDTLSQHGIHD (SEQ ID
NO:597); GLTVHLVIKTQNRP (SEQ ID NO:598); SELQSQMQRQLLSNPEMM
(SEQ ID NO:599); PmSHMLNNPDIMR (SEQ ID NO:600); and/or 
RQLIMANPQMC)QUQRNP (SEQ ID NO:601). Polynucleotides encoding these
polypeptides are also encompassed by the invention.

This gene is expressed primarily in breast.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, breast cancer. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of tumor systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of some types of
breast cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 103 

The translation product of this gene shares sequence homology with secreted
serine proteases and lysozyme C precursor, which is thought to be important in
bacteriolytic function. In specific embodiments, polypeptides of the invention comprise
the sequence: NLCHVDCQDLLNPNLLAGIHCAKRIVS (SEQ ID NO:602);
UXjFEGYSI^DWLCLAFVESKFN (SEQ ID NO:603);
NENADGSFDYGLFQINSHYWCN (SEQ ID NO:604); and/or
NLCHVDCQDLLJIPNLLAGIHCAKRIVS (SEQ ID NO:605). Polynucleotides
encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in testes.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infection. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 336 as residues: Ile-62 to Phe-70,
Asn-78 to Asn-84.

The tissue distribution and homology to lysozyme C precursor indicate that
polynucleotides and polypeptides corresponding to this gene are useful for boosting the
monocyte-macrophage system and enhancing the activity of immunoagents.

FEATURES OF PROTEIN ENCODED BY GENE NO: 104

This gene is expressed primarily in apoptotic T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of some immune
disorders.



FEATURES OF PROTEIN ENCODED BY GENE NO: 105 

The translation product of this gene shares sequence homology with ARI
protein of Drosophila (accession 2058299; EMBL: locus DMARIADNE, accession
X98309), which is thought to be important in axonal path-finding in the central nervous
system. In specific embodiments, polypeptides of the invention comprise the sequence:
IREVNEVIQNPAT (SEQ ID NO:606); mUIXSHFNWDKEKLMERYF 
DGNLEKLFA (SEQ ID NO:607); NTRSSAQDMPCQICYLNYPNSYF (SEQ ID
NO:608); TGI^CGHKFCMCJCWSEYLTrKIMEEGMGQTISCPA^ (SEQ ID
NO:614); CDILVDDNTVMRLITDSKVKLKYQHLTmSFVECNRI^^ 
CHHWKVQYPDAKPV (SEQ ID NO:609); CDILYDDNTYMRLTTDSK
VKLKYQHLITNSE^a4RLLKWCPAPIX:HHWKV (SEQ ID NO:^
Gam^VaWQNCKAEFCWV(XGPWEPHGSAAVYNCmY^^ 
RSRAALQRYL (SEQ ID NO:611); FYCNRYMNHMQSLRFEHKLYAQVKQ
KMEEMQQHNMSWIEVQFLKKA\nD\aX:(y:RAT^ (SEQ ID NO: 612); 
YWAFYLKKlWQSIIFE^WQADl£NATEVI^GYIi
RYCESR (SEQ ID NO:613). Polynucleotides encoding these polypeptides are also
encompassed by the invention.

This gene is expressed primarily in adult brain, and to a lesser extent in 
endometrial tumor, melanocytes, and infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases or injuries involving axonal path development. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the central nervous
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to ARI protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for treatment of
disease states or injuries involving axonal path development, including
neurodegenerative diseases and nerve injury.

FEATURES OF PROTEIN ENCODED BY GENE NO: 106 

The translation product of this gene shares sequence homology with cytochrome
b561 [Sus scrofa], which is thought to be an integral membrane protein of
neuroendocrine storage vesicles of neurotransmitters and peptide hormones.

This gene is expressed primarily in frontal cortex and to a lesser extent in
rhabdomyosarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 339 as residues:
Ser-18 to Pro-24.

The tissue distribution and homology to cytochrome b561 [Sus scrofa] indicate
that polynucleotides and polypeptides corresponding to this gene are useful for 
treatment and diagnosis of neurological disorders. This gene may also be important in 
regulation of some types of cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 107 

In specific embodiments, polypeptides of the invention comprise the sequence:
MWGYLFVDAAWNFLGCUCGW (SEQ ID NO:615); MKHSSGNVSAIRSSILLL
RXSLSYIJ3NCLRVSAIFVYFLLFLLLS (SEQ ID NO:616); and/or MDQALRGSPSE
GFSTDPSPPQVGRQIPSFPPWRRLVLPKASGCFLEREWWLCVFK^
HAYNSSELGGRGKGIT (SEQ ID NO:617). Polynucleotides encoding these
polypeptides are also encompassed by the invention.

This gene is expressed primarily in pancreas tumor and to a lesser extent in
cerebellum.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, pancreatic tumors. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the endocrine system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 340 as residues:
Pro-22 to Phe-33.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of pancreatic tumors.

FEATURES OF PROTEIN ENCODED BY GENE NO: 108 

This gene maps to chromosome 17 and therefore polynucleotides of the 
invention can be used in linkage analysis as a marker for chromosome 17. In specific 
embodiments, polypeptides of the invention comprise the sequence: 

MIPAI^SCCHFSPPEQAARLKKU2EQEK(X^KVEFRKRMEKEVS
KXFQPMNKffiRSILHDVVEVAGLTSFSFGEDDDCRYVMIFKKEF^^
RRGEEWDPQKAEEKRNXKELAQRQ (SEQ ID NO:618); EEEAAQQGPWV
SPASDYKDKYSHUGKGAAKDAAHMLQANKTYGCXPVANKRDTRSffiE^^
IRAKKRLRQSGE (SEQ ID NO:619); PPRRPAQLPLTPGAGQGAGRDKAAAIRA
HPGAPPLNHLLP (SEQ ID NO:620); AVPQAGGKQVFDLSPLELGYVRGMCVCV
(SEQ ID NO:621) and/or MLPAIJVSCCHFSPPEQAAia.KKLQEQEKQQKVEFRK
RMEKEVSDHQDSGQIKKKFQPMNKIEI^IUroVVEVAGL^^
MIFKKEFAPSDEELJ>SYRRGEEWDPQKAEEKRNXKELAQRQEEEA^
SPASDYKDKYSHUGKGAAKDAAHMLQANIOTGCXPVANKRI^^
IRAKKRLRQSGE (SEQ ID NO:622). Polynucleotides encoding these polypeptides
are also encompassed by the invention. The translation product of this gene shares
sequence homology with FSA-1 which may play a role as a structural protein
component of the acrosome.

This gene is expressed primarily in fetal kidney and sperm. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders, especially involving acrosomal dysfunction.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the male
reproductive system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 341 as residues: Glu-8 to Asn-35.

The tissue distribution and homology to FSA-1 indicates that polynucleotides 
and polypeptides corresponding to this gene are useful for treatment of infertility due to 
acrosomal dysfunction of sperm.

FEATURES OF PROTEIN ENCODED BY GENE NO: 109 

This gene is expressed primarily in pituitary and to a lesser extent in 
epididymis.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the male reproductive system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 342 as residues: Met-1 to Trp-6.

The expression of this gene in both pituitary and epididymis indicates that
polynucleotides and polypeptides corresponding to this gene are useful for treatment
and diagnosis of male reproductive disorders. This may involve a secreted peptide
produced in the pituitary targeting the epididymis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 110 

In specific embodiments, polypeptides of the invention comprise the sequence:
LLCPVLNSGXSWNFPHPSQPEYSFHGFHSTRLWI (SEQ ID NO:623); and/or
PSTPWFLFLLGLTCPFSTSHPRWDSIPP (SEQ ID NO:624). Polynucleotides
encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in resting T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, T-cell disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.
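
The comparison of a sample's expression level with the standard expression level can be illustrated with a minimal Python sketch. It is an illustration only and not part of the disclosed methods: the function name, the use of a log2 ratio, and the two-fold cutoff are all assumptions, since the text does not state what counts as "significantly" higher or lower.

```python
import math

def classify_expression(sample_level, standard_level, min_fold=2.0):
    """Compare a sample's expression level with the standard level.

    Hypothetical helper: `min_fold` is an assumed cutoff, not a value
    taken from the specification. Both levels must be positive and
    measured in the same units.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    # The log2 ratio treats over- and under-expression symmetrically.
    log2_ratio = math.log2(sample_level / standard_level)
    cutoff = math.log2(min_fold)
    if log2_ratio >= cutoff:
        return "higher"
    if log2_ratio <= -cutoff:
        return "lower"
    return "unchanged"
```

For example, under the assumed two-fold cutoff, `classify_expression(8.0, 2.0)` reports a higher level than the standard and `classify_expression(1.0, 4.0)` reports a lower one.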

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of certain immune
disorders, especially those involving T-cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 111 

This gene is expressed primarily in cerebellum and whole brain and to a lesser
extent in infant brain and fetal kidney.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 344 as residues:
Asp-48 to Gly-55.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
disorders. 






FEATURES OF PROTEIN ENCODED BY GENE NO: 112 

The translation product of this gene shares sequence homology with a yeast
mitochondrial ribosomal protein homologous to ribosomal protein S15 of E. coli, which
is thought to be important in the early assembly of ribosomes (See Accession No.
M38016). This gene maps to chromosome 1, and therefore may be used as a marker
in linkage analysis for chromosome 1.

This gene is expressed primarily in developmental tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, development of cancers and tumors in addition to healing wounds.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and developmental systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to ribosomal protein S15 of E. coli
indicates that polynucleotides and polypeptides corresponding to this gene are useful for
diseases related to the assembly of ribosomes in the mitochondria, which is important in
the translation of RNA into protein. Therefore, this indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and intervention of
multiple tumors as well as in healing wounds, which are thought to be under similar
regulation as developmental tissues. Protein, as well as antibodies directed against the
protein, have utility as tumor markers, in addition to immunotherapy targets, for the
above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 113 

The translation product of this gene shares sequence homology with human
poliovirus receptor precursors, which are thought to be important in viral binding and
uptake. Preferred polypeptide fragments comprise the following amino acid sequence:
EI^ISISlWAIJU)EGEYTCSIFIWWTAKSL\rr^^
ATLNCQSSGSKPAARLTWRKGDQElJ^GEP^UQEDPNGKTFTVSSS^^^
EDDGASIVCSVNHESLKGADRSTSQRmVLYTPrAlVffi«>DPPHPR
EGRGNPVPQQYLWEKEGSWPLKMTQESALIFPFLNKSDSGTYGCT
YKAYYTLNVND (SEQ ID NO:625). Also preferred are polynucleotide fragments
encoding these polypeptide fragments (See Accession No. gnl|PID|d1002627).

This gene is expressed almost exclusively in human brain tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, susceptibility to viral disease and diseases of the CNS, especially cancers
of that system. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the central nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 346 as residues: Leu-26 to Asp-37,
Lys-53 to Ser-59.
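
The residue ranges given for preferred epitopes (for example, "Leu-26 to Asp-37") can be turned into sequence fragments mechanically. The following minimal Python sketch is an illustration only and not part of the disclosed methods: the function names and the toy sequence in the usage note are hypothetical, and it assumes 1-based, inclusive residue numbering counted from the first residue of the listed sequence.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Leu-26 to Asp-37"; stray whitespace after the
# hyphen (e.g. from a line break) is tolerated.
RANGE_RE = re.compile(
    r"([A-Z][a-z]{2})-\s*(\d+)\s+to\s+([A-Z][a-z]{2})-\s*(\d+)"
)

def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]

def extract_epitopes(sequence, text):
    """Slice each named range out of a one-letter amino acid sequence.

    Raises ValueError if a range falls outside the sequence or if the
    named boundary residues do not match the sequence, which guards
    against numbering the wrong sequence.
    """
    fragments = []
    for start_aa, start, end_aa, end in parse_epitope_ranges(text):
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        if (sequence[start - 1] != THREE_TO_ONE.get(start_aa)
                or sequence[end - 1] != THREE_TO_ONE.get(end_aa)):
            raise ValueError(f"boundary residues do not match at {start}-{end}")
        fragments.append(sequence[start - 1:end])
    return fragments
```

With the toy sequence `"MKLSDAG"`, the call `extract_epitopes("MKLSDAG", "Lys-2 to Ser-4")` yields the single fragment `"KLS"`. Checking the boundary residues is a deliberate design choice: it catches an off-by-one error or a mismatched sequence listing before a fragment is reported.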

The tissue distribution and homology to poliovirus receptor precursors indicates
that polynucleotides and polypeptides corresponding to this gene are useful for the
treatment and prevention of diseases that involve the binding and uptake of virus
particles for infection. It might also be helpful in genetic therapy where the goal is to
insert foreign DNA into infected cells. With the help of this protein, the binding and
uptake of this foreign DNA might be aided. In addition, it is expected that
overexpression of this gene will indicate abnormalities involving the CNS, particularly
cancers of that system.


FEATURES OF PROTEIN ENCODED BY GENE NO: 114 

The translation product of this gene shares sequence homology with
Y087_CAEEL hypothetical 28.5 KD protein ZK1236.7 in chromosome III of
Caenorhabditis elegans in addition to alpha-1 collagen type III (See Accession No.
gi|537432). One embodiment for this gene is the polypeptide fragment(s) comprising
the following amino acid sequence: VPELPDRVHQLHQAVQGCALGRPGFPGGPTH
SGHHKSHPGPAGGDYNRCDRPGQVHLHNPRGTGRRGQLHPTAGPGVHRRA
CPSQQLPHRLGPGVPCPSPSLTPVLPSWTQSWCG LPGYTSSS (SEQ ID
NO:630). An additional embodiment is the polynucleotide fragment(s) encoding these
polypeptide fragments.

This gene is expressed primarily in brain cells and to a lesser extent in activated
B and T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegeneration and immunological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the neural and immune systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 347 as residues: Glu-34 to Glu-39, Gly-51 to Ser-72, Ala-88 to Glu-93, Gln-100
to Val-105.

The tissue distribution and homology to Y087_CAEEL hypothetical 28.5 KD
protein ZK1236.7 in chromosome III of Caenorhabditis elegans as well as to a
conserved alpha-1 collagen type III protein indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the detection and treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorders. Because the gene is expressed in
cells of lymphoid origin, the natural gene product may be involved in immune
functions. Therefore it may be also used as an agent for immunological disorders
including arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 115 

The translation product of this gene shares sequence homology with alpha 3
type IX collagen, which is thought to be important in hyaline cartilage formation via its
ability to uptake inorganic sulfate by cells (See Accession No. gi|975657). One
embodiment of this gene is the polypeptide fragment comprising the following amino
acid sequence: SLRRPRSAAXQTLTTFLSSVSSASSSALPGSREPCDPRAPPPPR
SGSAASCCSCCCSCPRRRAPLRSPRGSKRRIRQREVVDLYNGMCLQGPAGVPG
RDGSPGANGIPGTPGIPGRDGFKGEKGECXRESFEESWTPNYKQCSW
GroLGKIAECTFTKMRSNSALRVLFSGSLRLKCRNACCQRWYF^
LPffiAHYLDQGSPEMNSTINIHRTSSVEGLCEGIGAGLVDVAIWVGTa
DASTGWNS VSRIEEELPK (SEQ ID NO:634). An additional embodiment is the
polynucleotide fragments encoding this polypeptide fragment.

This gene is expressed primarily in smooth muscle and to a lesser extent in 
synovial tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, dwarfism, spinal deformation, and specific joint abnormalities as well as
chondrodysplasias, i.e., spondyloepiphyseal dysplasia congenita, familial osteoarthritis,
Atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid and autoimmune
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the skeletal system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to alpha 3 type IX collagen indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
and diagnosis of diseases associated with the mutation in this gene which leads to the
many different types of chondrodysplasias. By the use of this product, the abnormal
growth and development of bones of the limbs and spine could be routinely detected or
treated in utero since the protein or muteins thereof could affect epithelial cells early in
development and later the chondrocytes of the developing craniofacial structure.

FEATURES OF PROTEIN ENCODED BY GENE NO: 116 

The translation product of this gene shares sequence homology with retrovirus-related
reverse transcriptase, which is thought to be important in viral replication. One
embodiment for this gene is the polypeptide fragments comprising the following amino
acid sequence: TKKENCRPASIMNTOTKIIJ^^ An
additional embodiment is the polynucleotide fragments encoding these polypeptide
fragments (See Accession No. pir|A25313|GNHUL1).

This gene is expressed primarily in human meningima.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, retroviral diseases such as AIDS, and possibly certain cancers due to
transactivation of latent cell division genes. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to retrovirus-related reverse transcriptase
indicates that polynucleotides and polypeptides corresponding to this gene are useful for
the detection and treatment of diseases and maladies associated with retroviral infection
since a functional reverse transcriptase (RT) or RT-like molecule is an integral
component of the retroviral life cycle.

FEATURES OF PROTEIN ENCODED BY GENE NO: 117 

The translation product of this gene shares sequence homology with an
unknown gene from C. elegans, as well as weak homology with mammalian metaxin, a
gene contiguous to both thrombospondin 3 and glucocerebrosidase that is known to be
required for embryonic development. Preferred polypeptide fragments comprise the
following amino acid sequence: MCNLPIKWCRANAEYMSPSGKVPXXHVGNQ
WSEIJ3PIVQFVKAKGHSI^EK3I£EVQKAEMKAYMELV^^^
DEATVGXITHXRYGSPYPWPIJaniJ\YQKQWEVKRK>a^
DVD(yx:QAl^QRLGTQPYFFNKQPimj)ALWGHLYm
YSNLLAFCRRI EQHYFEDRGKGRLS (SEQ ID NO:641); MCNLPIKWCRANAE
YMSPSGKVPXXHVGNQ WSELGPIVQFVK (SEQ ID NO:642). Also preferred are
polynucleotide fragments encoding these polypeptide fragments (See Accession No.
gi|1326108).

This gene is expressed primarily in fetal tissues and to a lesser extent in
hematopoietic cells and tissues, including spleen, monocytes, and T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer, lymphoproliferative disorders, inflammation, chondrosarcoma,
and Gaucher disease. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the hematopoietic and embryonic systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of cancer and other
proliferative disorders. Expression in embryonic tissue and other cellular sources
marked by proliferating cells indicates that this protein may play a role in the regulation
of cellular division. Additionally, the expression in hematopoietic cells and tissues
indicates that this protein may play a role in the proliferation, differentiation, and
survival of hematopoietic cell lineages. Thus, this gene may be useful in the treatment
of lymphoproliferative disorders, and in the maintenance and differentiation of various
hematopoietic lineages from early hematopoietic stem and committed progenitor cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 118 

The translation product of this gene shares sequence homology with reverse
transcriptase, which is important in the synthesis of a cDNA chain from an RNA
molecule and is the means by which the infecting RNA chains of retroviruses are
transcribed into their DNA complements. One embodiment for this gene is the
polypeptide fragment comprising the following amino acid sequence:
MXXXNSHmFTIjmfGlJSfAPNEW^^
KIKGWRIOYQANGKQKK (SEQ ID NO:647). An additional embodiment is the
polynucleotide fragments comprising polynucleotides encoding these polypeptide
fragments (See Accession No. gi|2072964).

This gene is expressed primarily in skin and to a lesser extent in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer, hematopoietic disorders, inflammation, and disorders of immune
surveillance. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the epidermis and/or hematopoietic system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to reverse transcriptase indicates that
polynucleotides and polypeptides corresponding to this gene are useful for cancer
therapy. Expression in the skin also indicates that this gene is useful in wound healing
and fibrosis. Expression by neutrophils also indicates that this gene product plays a role
in inflammation and the control of immune surveillance (i.e., recognition of viral
pathogens). Reverse transcriptase family members are also useful in the detection and
treatment of AIDS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 119 

The translation product of this gene shares sequence homology with reverse
transcriptase, which is important in the synthesis of a cDNA copy of an RNA molecule
and is the means by which a retrovirus reverse-transcribes its genome into an inheritable
DNA copy.

This gene is expressed primarily in the frontal cortex of brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer and neurodegenerative disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the CNS and peripheral nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to reverse transcriptase suggest that this gene is
useful in the treatment of cancer and AIDS. The expression in brain indicates that it
plays a role in neurodegenerative disorders and in neural degeneration.

FEATURES OF PROTEIN ENCODED BY GENE NO: 120

One embodiment of this gene has homology to a hypothetical protein in
Schizosaccharomyces pombe (See Accession No. 2281980). Another embodiment for
this gene is the polypeptide fragments comprising the following amino acid sequence:
lYHLHSAVIFFHFKRAFCMCFITMKVIH^
(SEQ ID NO:651). An additional embodiment is the polynucleotide fragments
encoding these polypeptide fragments. This gene maps to chromosome 18, and
therefore, may be used as a marker in linkage analysis for chromosome 18.

This gene is expressed primarily in adult hypothalamus and to a lesser extent in 
infant brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative disorders; endocrine function; and vertigo. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the brain, CNS and
peripheral nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of
neurodegenerative disorders; diagnosis of tumors of a brain or neuronal origin;
treatments involving hormonal control of the entire body and of homeostasis,
behavioral disorders, such as Alzheimer's Disease, Parkinson's Disease, Huntington's
Disease, schizophrenia, mania, dementia, paranoia, obsessive compulsive disorder and
panic disorder. In addition, the gene or gene product may also play a role in the
treatment and/or detection of developmental disorders associated with the developing
embryo.

FEATURES OF PROTEIN ENCODED BY GENE NO: 121 

The translation product of this gene shares sequence homology with the human
IRLB protein, which is thought to be important in binding to a c-myc promoter element
and thus regulating its transcription (See Accession No. gi|33969). This gene maps to
chromosome 1, and therefore may be used as a marker in linkage analysis for
chromosome 1.

This gene is expressed primarily in brain and breast and to a lesser extent in a
variety of hematopoietic tissues and cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer of the brain and breast; lymphoproliferative disorders; and
neurodegenerative diseases. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the CNS, breast, and immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of cancer of the
brain, breast, and hematopoietic system. In addition, it may be useful for the treatment
of neurodegenerative disorders, as well as disorders of the hematopoietic system,
including defects in immune competency and inflammation. Protein, as well as
antibodies directed against the protein, may show utility as tumor markers and
immunotherapy targets for the above listed tumors and tissues.


FEATURES OF PROTEIN ENCODED BY GENE NO: 122 

The translation product of this gene shares sequence homology with an ATP 
synthase, a key component of the proton channel that is thought to be important in the 
translocation of protons across the membrane. 

This gene is expressed primarily in T cell lymphoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, T cell lymphoma. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to ATP synthase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
of defects in proton transport, homeostasis, and metabolism, as well as the diagnosis
and treatment of lymphoma. Because the gene is expressed in cells of lymphoid origin,
the natural gene product may be involved in immune functions. Therefore, it may also be
used as an agent for immunological disorders including arthritis, asthma, immune
deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 123 

This gene maps to chromosome 15, and therefore, may be used as a marker in
linkage analysis for chromosome 15.

This gene is expressed primarily in a variety of fetal tissues, including fetal 
liver, lung, and spleen, and to a lesser extent in a variety of blood cells, including 
eosinophils and T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer (abnormal cell proliferation); T cell lymphomas; and hematopoietic
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the fetus and immune system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of conditions
involving cell proliferation. Expression of this gene in fetal tissues, as well as in a
variety of blood cell lineages, indicates that it may play a role in cellular
proliferation, apoptosis, or cell survival. Thus it may be useful in the management and
treatment of a variety of cancers and malignancies. In addition, its expression in blood
cells suggests that it may play additional roles in hematopoietic disorders and conditions,
and could be useful in treating diseases involving autoimmunity, immune modulation,
immune surveillance, and inflammation.


FEATURES OF PROTEIN ENCODED BY GENE NO: 124 

This gene is expressed primarily in placenta and to a lesser extent in pineal gland 
and rhabdomyosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental, endocrine, and female reproductive disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the [insert system
where a related disease state is likely, e.g., immune], expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 357 as residues:
Leu-69 to Val-76.
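
A residue range written in the form used above (for example, "Leu-69 to Val-76") identifies a fragment of the polypeptide by its first and last residues, numbered from 1. The following Python sketch shows one way such a range could be resolved against a one-letter amino acid sequence. The function name `epitope_from_range` and the short sequence in the usage note are hypothetical; they are not taken from the sequence listing.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Leu-69 to Val-76" (spaces around "to" optional).
RANGE_RE = re.compile(r"([A-Za-z]{3})-\s*(\d+)\s*to\s*([A-Za-z]{3})-\s*(\d+)")


def epitope_from_range(sequence, residue_range):
    """Return the fragment of `sequence` named by a 1-based residue range.

    Raises ValueError if the range is malformed, falls outside the sequence,
    or names residues that do not match the sequence at those positions.
    """
    match = RANGE_RE.fullmatch(residue_range.strip())
    if match is None:
        raise ValueError(f"unrecognized residue range: {residue_range!r}")

    start_code = THREE_TO_ONE.get(match.group(1).capitalize())
    end_code = THREE_TO_ONE.get(match.group(3).capitalize())
    start, end = int(match.group(2)), int(match.group(4))
    if start_code is None or end_code is None:
        raise ValueError("unknown three-letter amino acid code")
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue range lies outside the sequence")

    fragment = sequence[start - 1:end]  # convert 1-based inclusive to slice
    if fragment[0] != start_code or fragment[-1] != end_code:
        raise ValueError("named residues do not match the sequence")
    return fragment
```

With the made-up sequence `"MKTLLVAG"`, the range `"Leu-4 to Val-6"` selects residues 4 through 6. Checking the named end residues against the sequence guards against off-by-one numbering errors.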

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of disorders in
development. The protein, as well as antibodies directed against the protein, may show
utility as tumor markers and immunotherapy targets for the above listed tumors and
tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 125

This gene is expressed primarily in benign prostatic hyperplasia. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of benign prostatic hyperplasia. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the reproductive
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of benign prostatic
hyperplasia. The protein, as well as antibodies directed against the protein, may show
utility as tumor markers and immunotherapy targets for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 126 

This gene is expressed primarily in apoptotic T-cells and to a lesser extent in 
suppressor T cells and ulcerative colitis. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases involving premature apoptosis, and immunological and
gastrointestinal disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of disorders involving
inappropriate levels of apoptosis, especially in immune cell lineages. Because the gene
is expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore, it may also be used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases (such as AIDS), and
leukemia.



FEATURES OF PROTEIN ENCODED BY GENE NO: 127

This gene is expressed primarily in Raji cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and T cell autoimmune disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 360 as residues: Asp-23 to Gly-29.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of inflammation and T
cell autoimmune disorders. Because the gene is expressed in cells of lymphoid origin,
the natural gene product may be involved in immune functions. Therefore, it may also be
used as an agent for immunological disorders including arthritis, asthma, immune
deficiency diseases (such as AIDS), and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 128 
The translation product of this gene shares sequence homology with a C. elegans
coding region C47D12^ of unknown function (See Accession No. gnl|PID|e348986).
One embodiment for this gene is the polypeptide fragments

comprising the following amino acid sequence: EDIXJFNRSIHEVILKNrrWY 

SERVLTEISLGSllJa.VVniTIQYNMmTRDKYLH^ 
AAQRnSIJFSLIJSKKHNKVIiQATQSLRGSLSSNDW^^

I^IINSCLTNSIJIHNPNLVALLYKRDLFEQFRTHPSFQDM 

QAGS (SEQ ID NO:657); EDIXjFNRSIHEVILKNrrWYSERVLTmSLGSUJ^ 

(SEQ ID NO:658); RTIQYNMTRTRDKTlirmClj\A^ 

RHSLFSLLSKKHN (SEQ ID NO:659); KKHNKVLEQATQSLRGSLSSNDVPLPDY 
AQD (SEQ ID NO:661); SCLTNSUIHNPNLVYALI.YKRDLFEQFRTHPSFQD
IMQNIDLVISFFSSRLLQAGS (SEQ ID NO:660). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments. This gene maps to
chromosome 18, and therefore, may be used as a marker in linkage analysis for
chromosome 18.

This gene is expressed primarily in smooth muscle and to a lesser extent in fetal
liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, atherosclerosis and other cardiovascular and hepatic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the circulatory
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of circulatory system
disorders such as atherosclerosis, hypertension, and thrombosis. In addition, the tissue
distribution indicates that polynucleotides and polypeptides corresponding to this gene
are useful for the detection and treatment of liver disorders and cancers (e.g.,
hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and conditions that are
attributable to the differentiation of hepatocyte progenitor cells). In addition, the
expression in fetus would suggest a useful role for the protein product in developmental
abnormalities, fetal deficiencies, pre-natal disorders and various wound-healing models
and/or tissue trauma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 129 
The translation product of this gene shares sequence homology with a ribosomal
protein which is thought to be important in cellular metabolism, in addition to the
C. elegans protein F40F11.1 which does not have a known function at the current time
(See Accession No. gnl|PID|e244552). Preferred polypeptide fragments comprise the
following amino acid sequence:
MADIQTERAYQKQPTIFQNKKRV1XGETGKEKIJ>RV^^

PRRIXRGTYIDKK(3*FrGNVSIRGRII^GVVTQDEDAEDHCI^ 
PLREAPQEHVCIPVPL LCJGRPDR (SEQ ID NO:662); MKMQRTIVIRRDYLH
YmKYNRFEKRHKNMSVHl^PaTU)VQra^

AAGTKKQFQKF (SEQ ID NO:663); MADIQTERAYQKQPTIFQNKKRVLLGET 
GK (SEQ ID NO:664); HCHPPRLSALHPQVQPLREAPQEHVCTPVPL LQGRPDR 
(SEQ ID NO:666); MGLGFKDTPRRlXRGTYroKXCPF^G^^VSIRGRIl^GVW 
(SEQ ID NO:669); MmQRTTVaRRDYLHYIRKYNRFEKRHKNMSVHLSP (SEQ
ID NO:667); CFRDVQIGDrVTVGECRPl^KTVRFN\a.K\aX^ 
(SEQ ID NO:668). Also preferred are polynucleotide fragments encoding these 
polypeptide fragments. 

This gene is expressed primarily in Wilms' tumor and to a lesser extent in
thymus and stromal cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases affecting RNA translation. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly Wilms' tumors, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 362 as residues:
Thr-11 to Asp-20.

The tissue distribution and homology to a ribosomal protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diseases
affecting RNA translation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 130 

The translation product of this gene shares sequence homology with a yeast
DNA helicase which is thought to be important in global transcriptional regulation (See
Accession No. gnl|PID|e243594). One embodiment for this gene is the polypeptide
fragments comprising the following amino acid sequence: IFYDSDWNPTVDQQA
MDRAHRIXKJTKQVTVYRUCKGTIEERILQRAKEK^ (SEQ ID 

NO:670); TRMmU^YMVYRiarrrai[X>GSSKISERRDM^

FVFLLSTRAGGLGINLTAXDTVHF (SEQ ID NO:671); TRMIDLLEEYMVYRK 
HTYXRLDGSSKISERRDM (SEQ ID NO:674); RRDMVADFQNRNDIFVFLL
STRAGGLGINLTAXDTVHF (SEQ ID NO:675); IFYDSDWNPTVDQQAMD
RAHRLGQTKQVTVYRLICKG (SEQ ID NO:676); RLICKGTIEERILQRAK
EKSEIQRMVISG (SEQ ID NO:678). An additional embodiment is the polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in amygdala.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases and disorders of the brain. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to a DNA helicase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diseases
affecting RNA transcription, particularly developmental disorders and healing wounds,
since the latter are thought to approximate developmental transcriptional regulation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 131 

This gene is expressed primarily in prostate and to a lesser extent in amygdala
and pancreatic tumors.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, prostate enlargement and gastrointestinal disorders, particularly of the
pancreas and gall bladder. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the reproductive system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of prostate diseases,
including benign prostatic hyperplasia and prostate cancer. In addition, the tissue
distribution in tumors of the pancreas indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and intervention of these tumors, in
addition to other tissues where expression has been indicated. The protein, as well as
antibodies directed against the protein, may show utility as tumor markers and/or
immunotherapy targets for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 132 

This gene is expressed primarily in adult lung and to a lesser extent in 
hypothalamus. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, pulmonary diseases and neurological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the pulmonary and respiratory
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of pulmonary and
respiratory disorders such as emphysema, pneumonia, and pulmonary edema and
emboli. In addition, the tissue distribution indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the detection/treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, the gene or gene
product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sexually-linked disorders, or
disorders of the cardiovascular system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 133 
This gene is expressed primarily in human liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cirrhosis of the liver and other hepatic disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the digestive system, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of liver disorders such
as cirrhosis, jaundice, and hepatitis. The protein, as well as antibodies directed against
the protein, may show utility as tumor markers and/or immunotherapy targets for the above
listed tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 134

This gene is expressed primarily in fetal kidney and to a lesser extent in fetal
liver and spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, development and regeneration of liver and kidney and immunological
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the digestive and excretory systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 367 as residues: Pro-70 to Arg-77,
Tyr-102 to Thr-107.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of diseases of the
kidney and liver, such as cirrhosis, kidney failure, kidney stones, and liver failure,
hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and conditions that are
attributable to the differentiation of hepatocyte progenitor cells. In addition, the
expression in fetus would suggest a useful role for the protein product in developmental
abnormalities, fetal deficiencies, pre-natal disorders and various wound-healing models
and/or tissue trauma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 135

This gene is expressed primarily in brain, bone marrow, and to a lesser extent in
placenta, T cell, testis and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative and immunological diseases and cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the nervous and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 368 as residues: Met-1 to His-6.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. In addition, the gene or gene product may also
play a role in the treatment and/or detection of developmental disorders associated with
the developing embryo, or sexually-linked disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 136 
The translation product of this gene is homologous to the human WD repeat
protein HAN11. Preferred polypeptide fragments comprise the following amino acid
sequence:

MSUIGKRKEIYKYEAPWTVYAMNWSVRPDKRFRl^ 
LDEESSEnCRNTFDHPYPTTKLMWDTKG\^DLLATSGDYLR\^ 
RLECLLNNNKNSDFCAPLTSFDWNEVDPYLLGTSSIDTTCTIWGl^^
NLVSGHVKTQIJL\HDKE\rmiAFSRAGGGRDMFASVGAI^ 
ST[IYEDPQHHP1JJU.CWNKQDPNY1ATMAI^^ 

HVSMAIXGPHIHPATSALQRMTTRI^SGTSSKCPEPLRTI^WPTQIJCGEIN^ 
WASTQPELSPSATTTAWRYSECS VGGAVPTRQGLLYFLPLPHPQS (SEQ ID 

NO:679); MSLHGKRKEIYKYEAPWTVYAMNWSVRPDKRF^
EEYNNKVQLVGLJ)EESSEnCRNTFDHPYFITKlA^^ 
LRVWRVGETETRI^ail^NNKNSDFCAPLTSFDWNEV^ (SEQ ID 
NO:680); SFDWNEVDPYLLCnrSSroTTCTIWGLErGQVL^ 
TQLIAHDKE\nfDIAFSRAGGGRDMFASVGAIX;SV^^ 

HPLLRIX:WNKQDPNYI^TMAMDGMEV\aLDWW (SEQ ID
NO:681); VGADGSVRMFDLRHLEHSTHYEDPQHHPLLRLC^
TMAMDGMEVVnjDWWAHIJaXjn^^ 

SGTSSKa>EPUlTLSWPTQIJCGEINNVQWASTQPEI^PSATTrAWR 
GAVPTRCyJLLYFLPLPHPQS (SEQ ID NO:682). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in placenta, embryo, T cell and fetal lung and
to a lesser extent in endothelial, tonsil and bone marrow.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological and developmental diseases in addition to cancers.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 369 as residues: Gly-19 to Gln-28, Pro-36 to Phe-42.
The tissue distribution in tumors of colon, ovary, and breast origins indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. The protein, as well as antibodies directed against the
protein, may show utility as tumor markers and/or immunotherapy targets for the above
listed tumors and tissues. Because the gene is expressed in cells of lymphoid origin, the
natural gene product may be involved in immune functions. Therefore, it may also be
used as an agent for immunological disorders including arthritis, asthma, immune
deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 137

This gene is expressed primarily in TNF and INF induced epithelial cells, T
cells and kidney.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory conditions, particularly inflammatory reactions in the
kidney. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the renal
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 370 as residues: Thr-67 to Gly-72, Gln-132 to Ala-145,
Arg-150 to Pro-157.

The tissue distribution indicates that the protein products of this gene are useful
for treating the damage caused by inflammation of the kidney.



FEATURES OF PROTEIN ENCODED BY GENE NO: 138

This gene maps to chromosome 1, and therefore, may be used as a marker in 
linkage analysis for chromosome 1 (See Accession No. D63485). 

This gene is expressed primarily in breast cancer and colon cancer and to a
lesser extent in thymus and fetal spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers, especially of the breast and colon tissues. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution in tumors of colon and breast origins indicates that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and intervention of these tumors, in addition to other tumors where expression has been
indicated. Protein, as well as antibodies directed against the protein, may show utility as
a tumor marker and/or immunotherapy targets for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 139

This gene maps to chromosome 17, and therefore, can be used as a marker in
linkage analysis for chromosome 17.

This gene is expressed primarily in CD34 positive cells, and to a lesser extent in
activated T-cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunologically related diseases and hematopoietic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system
and hematopoietic system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution in CD34 positive cells, T-cells and neutrophils indicates that
polynucleotides and polypeptides corresponding to this gene are useful for treatment
and diagnosis of hematopoietic disorders and immunologically related diseases, such as
anemia, leukemia, inflammation, infection, allergy, immunodeficiency disorders,
arthritis, asthma, and immune deficiency diseases such as AIDS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 140 

This gene was recently cloned by another group, who called it the KIAA0313
gene. (See Accession No. dl021609.) Preferred polypeptide fragments
comprise the amino acid sequence:

LYATATVISSPSTEO^QDQGDRASLJDAADSGRGSWTSCSSGSHDNIQTIQ 
HQRSWETLPFGHTHFDYSGDPAGLWASSSHMDQn^SDHSTKYNRQN(JS^^ 
LEQACJSRASWASSTGYWGEDSEGDTGTIKRRGGKDVSffiAESSSLTSVTr^^ 
PWMPAHIAVASSTrKGIJL\RKEGRYREPPPIPPGYIG]PTO 

PDYNVALQRSRMVARSSDTAGPSSVQQPHGHPTSSRPVNKPQWHKXNESDPR
LAPYQSQGFSTEEDEDEQVSAV (SEQ ID NO:683); HMDQIMFSDHSTKYNRQ 
NQSRESLEQACJSRASWASSTGYWGE (SEQ ID NO:684); SVTTEETKPVPMP 
AHIAVASSTTKGUARKEGRYREPPPrPPGYIGIPrrD (SEQ ID NO:685); and 
VALQRSRMVARSSDTAGPSSVQQPHGHPTSSRPVNKPQW 

HKXNESDPRLAPYCJSCy}? (SEQ ID NO:686). Also preferred are polynucleotide
fragments encoding these polypeptide fragments. This gene maps to chromosome 4,
and therefore, may be used as a marker in linkage analysis for chromosome 4 (See
Accession No. AB002311).

This gene is expressed primarily in ovarian cancer and in tumors of the testis, brain,
and colon.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, ovarian, testicular, brain and colon cancers. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the male and female reproductive systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution in tumors of colon, ovary, testis, and brain origins
indicates that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. Protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 141 

This gene is expressed primarily in spleen and colon cancer.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, colon cancer and immunological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the gastrointestinal tract and immune
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution in tumors of colon, ovary, and breast origins indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. Protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues.



FEATURES OF PROTEIN ENCODED BY GENE NO: 142

Translation product is homologous to T cell translocation protein, a putative zinc
finger factor (See Accession No. 340454), as well as to the G-protein coupled receptor
TM5 consensus polypeptide (See Accession No. R50734). Preferred polypeptide
fragments comprise the following amino acid sequence:

aii^VFVSLGMROiWTIVYNV^ (SEQ ID NO:687); 

ACSKLIPAFEMVMRAKDNVYHLDCTACQLCNQRXCVGDKFFLK^ 
DYEEGLMKEGYAPXVR (SEQ ID NO:688). Also preferred are polynucleotide 
fragments encoding these polypeptide fragments. 

This gene is expressed primarily in fetal brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders including brain cancer. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the Central Nervous System,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. In addition, the gene or gene product may also
play a role in the treatment and/or detection of developmental disorders associated with
the developing embryo.

FEATURES OF PROTEIN ENCODED BY GENE NO: 143

Translation product for this gene has significant homology to the Fas ligand,
which is a cysteine-rich type II transmembrane protein/tumor necrosis factor receptor
homolog. Mutations within this protein have been shown to result in generalized
lymphoproliferative disease leading to the development of lymphadenopathy and
autoimmune disease (See Medline Article No. 94185175). Preferred polypeptide
fragments comprise the following amino acid sequence:

SA1JSEPGAPDRRRPCTESWRRPDDEQWPPPTALC1X>VAPIJ>PSS (SEQ ID 
NO:689). Also preferred are polynucleotide fragments encoding these polypeptide 
fragments (See Accession No. 473565).

This gene is expressed primarily in osteoblasts, lung, and brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, osteoblast-related, pulmonary, neurological, and immunological
diseases. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the skeletal and nervous systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 376 as residues: Trp-33 to Thr-40,
Lys-45 to Ile-63.

The tissue distribution in osteoblasts, lung, and brain combined with its
homology to the Fas ligand indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and intervention of these tumors, in
addition to other tumors where expression has been indicated. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tumors and tissues. Because the Fas ligand
gene is known to be expressed in cells of lymphoid origin, the natural gene product
may be involved in immune functions. Therefore, it may also be used as an agent for
immunological disorders including asthma, immune deficiency diseases such as AIDS
and leukemia, and various autoimmune disorders including lupus and arthritis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 144 

This gene shares sequence homology with a 21.5 KD transmembrane protein in
the SEC15-SAP4 intergenic region of yeast. (See Accession No. 1723971.) Preferred
polypeptide fragments comprise the amino acid sequence:

AHASESGERWWACCGVRFGLRSIEAIGRSCCHDGPGGLVANRGRRFKWA^ 
SGPGGGSRGRSDRGSGQGDSLYPVGYLDKQWDTSVQEn)RILVEKRCWDIAL
GPLKQIPMNnLFIMYMAGNTISIFPTM^
KFLQGLVYLIGNmGI^AVYKCQSMGUJmiASDX^^ 
GLLL (SEQ ID NO:691); PVGYLDKQVPDTSVQETDRILVEKRCWDIALGPLKQ 
IPMNLH (SEQ ID NO:693); and ATFKMLESSSQKFLQGLXnfLIGNUVlGLAl^ 
YKCQSMGLLPTTHASD (SEQ ID NO:692). Also preferred are polynucleotide
fragments encoding these polypeptide fragments. 

This gene is expressed primarily in osteoclastoma, hemangiopericytoma, liver, and
lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, osteoclastoma, hemangiopericytoma, liver and lung tumors. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the above tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the lung
and liver systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing osteoclastoma, 
hemangiopericytoma, liver and lung tumors. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 145 

Translation product of this gene shares homology with the glucagon-69 gene,
which may indicate this gene plays a role in regulating metabolism (See Accession No.
A60318). One embodiment for this gene is the polypeptide fragments comprising the
following amino acid sequence:

PTTKLDMEKKKfflQIRFPSFYHKLVDSGRMRSK^ (SEQ ID 

NO:694). An additional embodiment is the polynucleotide fragments encoding these 
polypeptide fragments. 

This gene is expressed primarily in brain, kidney, colon, and testis. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain, kidney, colon, and testicular cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the male reproductive system, neurological,
circulatory, and gastrointestinal systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution in tumors of brain, kidney, colon, and testis origins
indicates that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. Protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues. The tissue distribution indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the detection/treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, the gene or gene
product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sexually-linked disorders, or
disorders of the cardiovascular system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 146

The translation product of this gene shares sequence homology with goliath
protein, which is thought to be important in the regulation of gene expression during
development. Protein may serve as a transcription factor. One embodiment for this gene
is the polypeptide fragments comprising the following amino acid sequence:

TEHHAVMITELRGKDII^YIJEKMSVQMTIAVG^^
lAfflSSAWUFYHQmYTNARDRNQRRUJDAAK^ 
PDFDHCAVOESYKQNDVVRILPCKHVFHKSCVDPWl^ 
LGIV (SEQ ID NO:695); TEHHAVMITEUlGKDII^YIiKMSVQMTIAVGTRMP 
PKNFSRGSLVFVSISFIVLM nSSAWUFYF (SEQ ID NO:697); SISHVLMHSSA

WUFYHQmYTNARDRNQRRLGDAAKKAISKLTTRTV^ (SEQ ID
NO:698); VKKGDKETDPDFDHCAVaESYKQNDVVIUU>CKHVF^
WLSEHCrCPMCKLNILKALGIV (SEQ ID NO:699). An additional embodiment is
the polynucleotide fragments encoding these polypeptide fragments (See Accession No.
157535). Moreover, another embodiment is the polynucleotide fragments encoding
these polypeptide fragments:

NmiPGTEffllAVMITEIJRGKDIIJSYIJEKNISVQMmVGTRMPPKNFSR^

LWVSISFTVU^nSSAWLIFmQKmYTNARDRNQRRLGDAAKKAISKLTTR^ 
KKGDKETDPDFDHCAVCIESYKQNDWRIU>CKHVFHKSCVDPWLSEHCTCP 
MCKIjra.KAlXjIWNLPCTDNVAFDMERLTRTQAVNlUlSAIXTDI^ 
PLRTSGISPlJ>QIXiELTPRTGEINIAVTKEWFIIASFGlXSALTLCYMIIRATASLN 

ANEVEWF (SEQ ID NO:696); MTHPGTEHIIAVMrrELRGKDILS YLEKNIS VQM
TIAVGTRMPPKM^SRGSLVFVSISFIVLMnSSAWLIFYHQKIRYTNARDRNQRR 
LGDAAKKAISKLTTRT (SEQ ID NO:700); AAKKAISKLTTRTVKKGDKE 
TDPDFDHCAVCIESYKQNDVVRILPCKHVFHKSCVDPWLSEHCTCPMCKLNIL 
KALGIVPNLPC (SEQ ID NO:701); TQAVNRRSALGDLAGDNSLGLEPLRTSGI

SPU>QDGELTPRTGEIMAVTKEWFnASFGU;SALTLCYMmiATASIJ^

F (SEQ ID NO:702); PLHGVADHLGCDPQTRFFVPPNIKQWIALLQRGNCTF
KEKISRAAFHNAVAVVIYNNKSKEEPVTNmnXjrEHIIAVMr^^ 
KhnSVQMTXAVGTRMPPKNFSRGSLVFVSISFTVlJ^SSAWLIFYHQKIRYTNA 
RDRNQRRIXjDAAKKAISKLTmTVKKGDKEroPDFDHCAVCIESYKQ>a>VVRI 

U<3CHVFHKSCVDPWI^EHCTCPMCKIJra.KAUJr\a»NLPCTO

RTQAVNRRSAIXJDLAGDNSLGI^PUlTSGISPUXJIXiELTPRTGEINIAVTKEW 
FIIASFGLLSALTLCYMIIRATASLNANEVEWF (SEQ ID NO:703); and

HGVADHLGCDPQTRFFVPPNIKQWIALLQRGNCmCEKISRAAFHNAVAVVIY 
NNKSKEE (SEQ ID NO:704). An additional embodiment is the polynucleotide
fragments encoding these polypeptide fragments. When tested against Jurkat cell lines,
supernatants removed from cells containing this gene activated the GAS pathway.
Thus, it is likely that this gene activates immune cells through the JAKS/STAT signal
transduction pathway.

This gene is expressed primarily in macrophage, breast, kidney and to a lesser
extent in synovium, hypothalamus and rhabdomyosarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, schizophrenia and cancer. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune and neural system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to zinc finger protein indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
of schizophrenia, kidney disease and other cancers. The tissue distribution in
macrophage, breast, and kidney origins indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and intervention of tumors within
these tissues, in addition to other tumors where expression has been indicated. Protein,
as well as antibodies directed against the protein, may show utility as a tumor marker
and/or immunotherapy targets for the above listed tumors and tissues. Because the gene
is expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore, it may also be used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases such as AIDS, and
leukemia.



FEATURES OF PROTEIN ENCODED BY GENE NO: 147 

The translation product of this gene shares sequence homology with HNP36
protein, an equilibrative nucleoside transporter, which is thought to be important in
gene transcription as well as serving as an important component of the nucleoside
transport apparatus (See Accession No. 1845345). One embodiment for this gene is
the polypeptide fragments comprising the following amino acid sequence:

MSGCJGlAGFFASVAMICAIASGSELSESAFGYFrrACAVIILTnC^^

YYQQLKI^PGEQETKU)LISKGEEPRAGKEESGVSVSNSQPTNESHSIK^ 
NISVlj\FSVCFIFTmGMFPAVTVEVKSSIAGSSTW 
RSLTAVFMWPGKDSRW1J>SWX1JUU.VF^ 
FFMAAFAFSNGYIj\SLCMCFGPKKVKPAE>^ 

PSCSGQLCDKGWTEGIJ>ASIJ>VCliPLPSARGDPEWSGGFFF (SEQ ID
NO:705); MSGCJGLAGFFASVAMICAIASGSELSESAFGYFITACAVIILTnC 
YLGLPRLEFYRYYQQLKLE GPGEQETKLJiLISKGEEPRAGKEESGVSVSNSQ 
PTNESHSI (SEQ ID NO:706); SGVSVSNSQPTNESHSIKAILKNISVLAFSVCFI 
FIT^GMFPAVTVEVKSSIAGSSTWERYFIPVSCFLTF^^ (SEQ ID 

NO:707); TIGMFPAVT\^VKSSL\GSSTWERYFn>VSCFLTFNI^

MWPGKDSRWIPSWXLARLWWmJLC^ PRRYLTWFEHDA (SEQ ID 
NO:708); FGPKKVKPAEAETAEPSWPSSCVWVWHWGLFSPSCSGQLCDK
GWTEGLPASLPVCLLPLPSARGDPEWSGGFFF (SEQ ID NO:709). An additional
embodiment is the polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in eosinophils and aortic endothelium and to a
lesser extent in umbilical vein endothelial cells and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematopoietic disease. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the circulatory system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to HNP36 protein indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
of blood neoplasias and other hematopoietic disease.


FEATURES OF PROTEIN ENCODED BY GENE NO: 148 

This gene is expressed primarily in breast cancer cell lines, thymus stromal 
cells, and ovary. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, endocrine and female reproductive system diseases including breast
cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
endocrine system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of endocrine
disorders. In addition, the tissue distribution in tumors of thymus, ovary, and breast
origins indicates that polynucleotides and polypeptides corresponding to this gene are
useful for diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. Protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 149

Translation product of this gene has homology to pmt1 and pmt2, two
conserved Schizosaccharomyces pombe genes. One embodiment for this gene is the
polypeptide fragments comprising the following amino acid sequence:
DDDGFEIWIEDPAKHRILDPEGIJVLGAVIASSKKAK^ 

ELPEWFVQEEKQHRIRQLPVGKKEVEHYRKIIWREINARPIXXXXXXXXXXX
XXXXXXI^QTRKKAEAVVNTVDIXRTRES (SEQ ID NO:710); 
DDIXJFEIWIEDPAKHRILDPEGIjVIXjAVIASSK^ (SEQ 
ID NO:711); KRWREINARPIXXXXXXXXXXXXXXXXXLEQTRKKAE
AWNTVDIXRTRES (SEQ ID NO:712). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments (See Accession No.
el216734).

This gene is expressed primarily in retina and ovary and to a lesser extent in
breast cancer cells, epididymis and osteosarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal growth disorders, cancer and reproductive system disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
neural and reproductive system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 382 as residues: Met-1 to Gly-7.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis or treatment of reproductive
system disease and cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 150

One embodiment for this gene is the polypeptide fragments comprising the following 
amino acid sequence: 

MIKDKGRARTALTSSQPAHLCPENPLimKAAVKEKKRNKKK^ 
PLNNKUJSISPAKTLPGACGSPQKLIDGFLKHEGPPAEKPI^E^ 
SU3SDPAGCVRPPAPNlAGAVEFNDVKTlXREWIT^SDP^^
IJEEKDI£KIJ)LVIKYMKR1MQQSVESVWN^ 

VT (SEQ ID NO:713); MIKDKGRARTALTSSQPAHLCPENPLLHLKAAVKE 
KKRNKKKKTIGSPKRIQ (SEQ ID NO:714); KRIQSPLNNKLLNSPAKT 

LPGACGSPQKLnXJFLKHEGPPAEKPLEELSASTSGVPGLSSLQSDPAGCVRPP 

APNIJ^GAVEFNDVKTLLREWirnSDPM (SEQ ID NO:715);
TISDPMEEDILQVVKYCroUEEKDI£KU)LVIKYMK^^ 
SVWNMAFDFimNVQVVLQQTYGSTLKVT(SEQro An additional 

embodiment is the polynucleotide fragments encoding these polypeptide fragments. 
This gene is expressed primarily in 12 week embryo and to a lesser extent in
hemangiopericytoma and frontal cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, growth disorders and hemangiopericytoma. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the circulatory and neural systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 383 as residues: Leu-4 to Lys-11.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment of growth disorders,
hemangiopericytoma and other soft tissue tumors.

FEATURES OF PROTEIN ENCODED BY GENE NO: 151

The translation product of this gene has been found to have homology to a
human DNA mismatch repair protein PMS3. Preferred polypeptide fragments comprise
the following amino acid sequence: FCHDCKFPEASPAMNCEP (SEQ ID NO:717).
Also preferred are polynucleotide fragments encoding these polypeptide fragments (See
Accession No. R95250).

This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, lymphoma, immunodeficiency diseases, and cancers resulting from
genetic instability. Similarly, polypeptides and antibodies directed to these polypeptides
are useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 384 as residues: Met-1 to Lys-6.

The tissue distribution in neutrophils and the sequence homology indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis of
Hodgkin's lymphoma, since the elevated expression and secretion by the tumor mass
may be indicative of tumors of this type. Additionally, the gene product may be used as
a target in the immunotherapy of the cancer. Because the gene is expressed in cells of
lymphoid origin, the natural gene product may be involved in immune functions.
Therefore, it may also be used as an agent for immunological disorders including
arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia.
Furthermore, its homology to a known DNA repair protein suggests that this gene may be
useful in establishing cancer predisposition and prevention in gene therapy applications.

FEATURES OF PROTEIN ENCODED BY GENE NO: 152

This gene is expressed primarily in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious diseases and lymphoma. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.
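
The comparison against a standard gene expression level described above can be illustrated with a minimal sketch. The specification does not define what counts as "significantly higher or lower", so the fold-change cutoff, the function name classify_expression, and the example values below are all assumptions chosen for illustration, not the method of the invention. A real assay would also need replicate measurements and a statistical test.

```python
import math


def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level with a standard (healthy) level.

    Both levels must be positive and in the same units. The fold_threshold
    (hypothetical default: 2-fold) is an assumed cutoff; the source text
    does not specify one.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    # Work on a log2 scale so that up- and down-regulation are symmetric.
    log2_ratio = math.log2(sample_level / standard_level)
    cutoff = math.log2(fold_threshold)

    if log2_ratio >= cutoff:
        return "higher"
    if log2_ratio <= -cutoff:
        return "lower"
    return "comparable"


# Hypothetical measurements in arbitrary units.
print(classify_expression(8.0, 2.0))  # prints "higher"
print(classify_expression(1.0, 4.0))  # prints "lower"
print(classify_expression(3.0, 2.0))  # prints "comparable"
```

The log scale is used so that a 2-fold increase and a 2-fold decrease are treated symmetrically around the standard level.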

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment of inflammation and infectious
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 153

One embodiment for this gene is the polypeptide fragments comprising the
following amino acid sequence:

MASSVPAGGHTRAGGIFUGKU)LEASLFKSFQWIJ>FV^ 
NFFCWDSSAHSlJPIJiPl^ASCSAPACHASDTHIXYPST^^ 

VFRTNAPGPTPSSQSSFVFPVFPVSFMALIVCXLVCC (SEQ ID NO:720);
MASSWAGGHTRAGGIFLIGKLDLEASLFKSFQWLPFVLRKKCNFFC\^
SLPLHPLSASCSAPACHA (SEQ ID NO:721); J^AWLVAPHSVFRTNAPGPTPS
SQSSPVFPVFPVSFMALJVCXLVCC (SEQ ID NO:722). An additional embodiment
is the polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed primarily in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and infectious disease. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 386 as residues:
Ser-11 to Pro-17.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment of infectious diseases and 
inflammation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 154 

This gene is expressed in multiple tissues including ovary, uterus, adipose 
tissue, brain, and the liver. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, uterine, ovarian, brain, and liver cancer. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the female reproductive system, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution of this gene indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnostic or therapeutic uses in
the treatment of disorders of the female reproductive system, obesity, and liver disorders,
particularly cancer in the above tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 155 

This gene maps to chromosome 3, and therefore, may be used as a marker in 
linkage analysis for chromosome 3 (See Accession No. D87452).

This gene is expressed in multiple tissues including brain, aortic endothelial
cells, smooth muscle, pituitary, testis, melanocytes, spleen, neutrophils, and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological disorders including immunodeficiencies, cancers of the
brain and the female reproductive system, as well as cardiovascular disorders, such as
atherosclerosis and stroke. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the central nervous and immune systems, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution suggests that polynucleotides and polypeptides
corresponding to this gene are useful in treatment/detection of disorders in the nervous 
system, including schizophrenia, neurodegeneration, neoplasia, brain cancer as well as 
cardiovascular and female reproductive disorders including cancer within the above 
tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 156 

The translation product of this gene shares sequence homology with the human 
gene encoding cytochrome b561 (See Accession No. P10897). Cytochrome b561 is a 
transmembrane electron transport protein that is specific to a subset of secretory vesicles
containing catecholamines and amidated peptides. This protein is thought to supply
reducing equivalents to the intravesicular enzymes dopamine-beta-hydroxylase and
alpha-peptide amidase. Preferred polypeptides of the invention comprise the amino acid 
sequence: 

MAMEGYWRFLALLGSAIXVGFI^VIFALVW\^^ 

VIJVIWGFVHQGIAIIVYRIP^^
NHNVNNIANMYSIJISWVGUAVICYIXQIJ^GFSV^^ 
YSGIVIFGTVIATAIJVIGLTEKIJFSIJUJPAYSTFPPEGV^^ 
WIVTRPQWKRPKEPNSTIIJHPNGGTEQGARGSMPAYSGW 
NSEVAARKRNLALDEAGQRSTM (SEQ ID NO:724); as well as antigenic fragments
of at least 20 amino acids of this gene and/or biologically active fragments. Also
preferred are polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in anergic T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune system and metabolism related diseases. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein product or RNA of this gene is 
useful for treatment or diagnosis of immune system and metabolic diseases or
conditions including Tay-Sachs disease, phenylketonuria, galactosemia, various 
porphyrias, and Hurler's syndrome. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 157 

The translation product of this gene shares sequence homology with collagen,
which is important in mammalian development. This gene also shows sequence
homology with bcl-2. (See Accession No. P80988.) Preferred polypeptide fragments
comprise the amino acid sequence: PGRAGPSPGLSLQLPAEPGHPAGNLAPL
TSRPQPLCRIPAVPG (SEQ ID NO:725). Also preferred are polynucleotide
sequences encoding this polypeptide fragment.

This gene is expressed primarily in HL-60 tissue culture cells and to a lesser 
extent in liver, breast, and uterus. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological diseases, hereditary disorders involving the MHC class
of immune molecules, as well as developmental disorders and reproductive disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and reproductive systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 390 as residues: Ser-39 to Gly-46, Leu-49
to Ala-62.

The tissue distribution and homology to collagen indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
hereditary MHC disorders and particularly autoimmune disorders including rheumatoid
arthritis, lupus, scleroderma, and dermatomyositis, as well as many reproductive
disorders, including cancer of the uterus and breast tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 158 

This gene is expressed primarily in the amygdala region of the brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, a variety of brain disorders, particularly those affecting mood and
personality. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the brain and central nervous system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and/or diagnosis of a variety of brain
disorders, particularly bipolar disorder, unipolar depression, and dementia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 159 

This gene is expressed in a variety of tissues and cell types including brain, 
smooth muscle, kidney, salivary gland and T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancers of a variety of organs including brain, smooth muscle, kidney, 
salivary gland and T-cells, and cardiovascular disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the central nervous, urinary, salivary,
digestive, and immune systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution in brain, smooth muscle, and T-cells indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis of
various neurological and cardiovascular disorders, including, but not limited to, cancer
within the above tissues. Additionally, the gene product may be used as a target in the
immunotherapy of the cancer. Because the gene is expressed in cells of lymphoid
origin, the natural gene product may be involved in immune functions. Therefore, it may
also be used as an agent for immunological disorders including arthritis, asthma,
immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 160 

The translation product of this gene shares sequence homology with collagen,
which is thought to be important in cellular interactions and extracellular matrix formation,
and has been found to be an identifying determinant in autoimmune disorders.
Moreover, this gene shows sequence homology with the yeast protein Sls1p, an
endoplasmic reticulum component involved in the protein translocation process in
the yeast Yarrowia lipolytica. (See Accession No. 1052828; see also J. Biol. Chem. 271,
11668-11675 (1996).) With mouse, this same region shows sequence homology with
the heavy chain of kinesin. (See Accession No. 2062607.) Recently, suppression of the
heavy chain of kinesin was shown to inhibit insulin secretion from primary cultures of
mouse beta-cells. (See Endocrinology 138 (5), 1979-1987 (1997).) Moreover, kinesin
was found associated with drug resistance and cell immortalization. (See 468355.)
Thus, it is likely that this gene also acts as a genetic suppressor element.

This gene is expressed primarily in the greater omentum and to a lesser extent in
a variety of organs and cell types including gall bladder, stromal bone marrow cells,
lymph node, liver, testes, pituitary, and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the endocrine, gastrointestinal, and immunological systems,
including autoimmune disorders and cancers in a variety of organs and cell types.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune and gastrointestinal systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 393 as residues: Asn-27 to Leu-47, Gln-81
to Lys-88, Asp-93 to Lys-102, Asn-107 to Leu-116, Met-129 to Glu-141, Glu-150
to Asp-157, Lys-176 to Glu-185, Glu-333 to Tyr-349, Cys-393 to Leu-403, Gln-423
to Gly-429.
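
The residue-range notation used for preferred epitopes (for example, "Asn-27 to Leu-47") maps directly onto 1-based, inclusive positions in a polypeptide sequence. The following is a minimal sketch of how such ranges could be parsed and the corresponding fragments extracted. The function names and the toy sequence are hypothetical; SEQ ID NO: 393 itself is not reproduced in this section, so the example does not use it.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Asn-27 to Leu-47"; whitespace after the hyphen
# is tolerated because line wrapping can split "Gln-" from "81".
RANGE_RE = re.compile(
    r"([A-Z][a-z]{2})-\s*(\d+)\s+to\s+([A-Z][a-z]{2})-\s*(\d+)"
)


def parse_residue_ranges(text):
    """Return (first_residue, start, last_residue, end) tuples found in text."""
    return [
        (first, int(start), last, int(end))
        for first, start, last, end in RANGE_RE.findall(text)
    ]


def extract_epitopes(sequence, text):
    """Slice each named range out of a one-letter amino acid sequence.

    Positions are 1-based and inclusive. A ValueError is raised when a range
    falls outside the sequence or its end residues do not match the names.
    """
    epitopes = []
    for first, start, last, end in parse_residue_ranges(text):
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        fragment = sequence[start - 1:end]
        if (fragment[0] != THREE_TO_ONE.get(first)
                or fragment[-1] != THREE_TO_ONE.get(last)):
            raise ValueError(f"{first}-{start} to {last}-{end} does not match")
        epitopes.append(fragment)
    return epitopes


# Hypothetical toy sequence, used only to show the indexing convention.
toy_sequence = "MKTLLSAGKW"
print(extract_epitopes(toy_sequence, "Met-1 to Leu-4, Ser-6 to Lys-9"))
# prints ['MKTL', 'SAGK']
```

Checking the first and last residue of each fragment against the named residues is a cheap way to catch a sequence that is out of register with the stated positions.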

The tissue distribution within various endocrine and immunological tissues,
combined with the sequence homology to a conserved collagen motif, indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis of various autoimmune disorders including, but not limited to, rheumatoid
arthritis, lupus erythematosus, scleroderma, and dermatomyositis. Because the gene is
expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore, it may also be used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases such as AIDS, and
leukemia.



FEATURES OF PROTEIN ENCODED BY GENE NO: 161

This gene has homology to the tissue inhibitor of metalloproteinase 2. Such
inhibitors are vital to proper regulation of metalloproteinases such as collagenases (See
Accession No. P16368). In addition, this gene maps to chromosome 17, and
therefore, may be used as a marker in linkage analysis for chromosome 17 (See
Accession No. P16368).

This gene is expressed primarily in several types of cancer including
osteoclastoma, chondrosarcoma, and rhabdomyosarcoma and to a lesser extent in
several non-malignant tissues including synovium, amygdala, testes, and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, various types of cancer, particularly cancers of bone and cartilage, as
well as various autoimmune disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the musculoskeletal system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution in various cancers and the sequence homology to a
collagenase inhibitor indicate that polynucleotides and polypeptides corresponding to
this gene are useful for detection of various autoimmune disorders such as rheumatoid
arthritis, lupus, scleroderma, and dermatomyositis. Therefore, it may also be used as an
agent for immunological disorders including arthritis, asthma, immune deficiency
diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 162 

This gene is homologous to the mitochondrial ATP6 gene and therefore is likely
a member of this gene family (See Accession No. X76197).

This gene is expressed primarily in brain tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, a variety of brain disorders, including Down's syndrome, depression,
schizophrenia, and epilepsy. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the central nervous system, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution in brain tissue indicates this gene is useful for diagnosis 
of various neurological disorders including, but not limited to, brain cancer.
Additionally, the gene product may be used as a target in the immunotherapy of cancer in
the brain as well as for the diagnosis of metabolic disorders such as obesity, Tay-Sachs
disease, phenylketonuria and Hurler's Syndrome.

FEATURES OF PROTEIN ENCODED BY GENE NO: 163

This gene is expressed primarily in placenta, neutrophils, and microvascular 
endothelial cells and to a lesser extent in multiple tissues including brain, prostate, 
spleen, thymus, and bone.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neutropenia and other diseases of the immune system. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution in placenta indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis of various female
reproductive disorders. Additionally, the gene product may be used as a target in the
immunotherapy of various cancers. Because the gene is expressed in some cells of
lymphoid and endocrine origin, the natural gene product may be involved in immune
functions and metabolism regulation, respectively. Therefore, it may also be used as an
agent for immunological disorders including arthritis, asthma, immune deficiency
diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 164 

This gene is expressed primarily in neutrophils, monocytes, bone marrow, and 
fetal liver.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune system disorders including, but not limited to, autoimmune
disorders such as lupus, and immunodeficiency disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution in various immune system tissues indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis of various immunological disorders such as Hodgkin's lymphoma, arthritis,
asthma, immune deficiency diseases such as AIDS, and leukemia.



FEATURES OF PROTEIN ENCODED BY GENE NO: 165 

The translation product of this gene shares sequence homology with dystrophin 
which is thought to be defective in both Duchenne and Becker Muscular Dystrophy.
Preferred polypeptide fragments comprise the following amino acid sequence:
MKIXGECSSSroSVKRL£HKLKEEEESLPGFWfLHSTE^ 
AI^KELJlMKQNUJKW(X?FNSDlJsfSIW 
IKKUCEUJKA\T)HRKAIII^IN1XSPEF^ 
CSUJEEVWlGUXJDAmQCQGFHEMS 

QDHHKQlJVlQIKHElJ^QlJlVASIXJDMSCQII.VNAEGTDa-^^

LKLLLKEVSRHIKELEKlXDVSSSQQDI^SWSSADEUyrSGSVSPXSGRSTPN^ 
QKTPRGKCSLSQPGPSVSSPHSRSTKGGSDSSLSEPXPGRSGRGFLFRVLRAA 
IPLQllJJLLLIGI^CLWMSEEDYSCAI^NNFARSFI^ (SEQ ID 

NO:726); MKUXiECSSSroSVKRl^HKLKEEEESLPGFVNLHST^ 

WEU^AQALSKELRMKQNlXJKWQQFNSDIJSfSIW

TDIQTIELQIK (SEQ ID NO:727); KLKELQKAVDHRKAEELSINLCSPEFTQADSK 

ESRDLQDRIJCQMNGRWDRVCSLL£EWRGIXQDAIJ4(JC(JGFHE^ 

ENTORRKNEIWroSNUDAEnjQDHHKQLMQIKHELI^Qm^ 

(SEQ ID NO:728); QDMSCQIXVNAEGTD(XEAKEKYHVIGNRIJaJL^^ 

RHIKELEKLLDVSSSC^DLSSWSSADELDTSGSVSPXSGRSTPNRQKTPRGKCS
LSQPGPSVSSPHS (SEQ ID NO:729); DSSLSEPXPGRSGRGFLFRVLRAAL 
PLQLIliJXIGLACXVPMSEEDYSCALSNNFARSFHPM^ (SEQ ID 

NO:730). Also preferred are polynucleotide fragments encoding these polypeptide
fragments. Furthermore, this gene maps to chromosome 6, and therefore, may be used
as a marker in linkage analysis for chromosome 6 (See Accession No. N628%).

This gene is expressed in numerous tissues including the heart, kidney, and
brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, musculoskeletal disorders including Muscular Dystrophy and 
5 cardiovascular diseases. Similarly, polypeptides and antibodies directed to these 

polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or ceil type(s). For a number of disorders of the above tissues or cells, 
particularly of the muscle tissues, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 

10 tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid firom an individual not having the disorder. 
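
The comparison described above, between expression in a test sample and the standard expression level, can be illustrated with a minimal sketch. The specification does not define what counts as "significantly" higher or lower, so the two-fold cutoff (`min_abs_log2=1.0`) and the function names below are assumptions chosen purely for illustration, and the input values are hypothetical.

```python
import math


def relative_expression(sample_level, standard_level):
    """Return log2(sample / standard) for two positive expression levels."""
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(sample_level / standard_level)


def classify_expression(sample_level, standard_level, min_abs_log2=1.0):
    """Label a sample as 'higher', 'lower', or 'unchanged' versus the standard.

    min_abs_log2 is an assumed cutoff (1.0 corresponds to a two-fold change);
    it is not taken from the specification.
    """
    log_ratio = relative_expression(sample_level, standard_level)
    if log_ratio >= min_abs_log2:
        return "higher"
    if log_ratio <= -min_abs_log2:
        return "lower"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    print(classify_expression(8.0, 2.0))
    print(classify_expression(1.0, 4.0))
    print(classify_expression(3.0, 2.5))
```

In practice the cutoff would be replaced by whatever statistical criterion the chosen assay supports.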

The tissue distribution and homology to dystrophin indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis and treatment of Muscular Dystrophy and other muscle disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 166 
This gene is expressed primarily in human cerebellum. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the central nervous system, including Alzheimer's Disease, 
Parkinson's Disease, ALS, and mental illnesses. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the central nervous system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 399 as residues: Pro-20 to Gly-26, Leu-37 to Pro-42, His-57 to Gly-63. 
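
The residue ranges used to designate preferred epitopes follow a regular "Xaa-N to Yaa-M" pattern. A minimal sketch of reading such a listing into numeric ranges is given below; the function names and the regular expression are illustrative assumptions, and the sketch only handles well-formed entries in that pattern.

```python
import re

# Matches entries such as "Pro-20 to Gly-26" (three-letter residue code, position).
RANGE_PATTERN = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (first_residue, start, last_residue, end) tuples found in text."""
    ranges = []
    for first_aa, start, last_aa, end in RANGE_PATTERN.findall(text):
        start, end = int(start), int(end)
        if start > end:
            raise ValueError(f"range runs backwards: {first_aa}-{start} to {last_aa}-{end}")
        ranges.append((first_aa, start, last_aa, end))
    return ranges


def epitope_lengths(text):
    """Return the number of residues spanned by each range (inclusive)."""
    return [end - start + 1 for _, start, _, end in parse_epitope_ranges(text)]


if __name__ == "__main__":
    listing = "Pro-20 to Gly-26, Leu-37 to Pro-42, His-57 to Gly-63"
    print(parse_epitope_ranges(listing))
    print(epitope_lengths(listing))
```

Entries that do not match the pattern are silently skipped by this sketch, so a listing should be checked by eye before relying on the parsed result.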

The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of diseases of the central nervous system and may protect or 
enhance survival of neuronal cells by slowing progression of neurodegenerative 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 167 
Preferred polypeptides encoded by this gene comprise the following amino acid 
sequence: 

MKmCGNYLAPSHSESSRRCCLLCFYPLCLEINFGMKVFl^MP 
SLIQED (SEQ ID NO:731). Polynucleotides encoding such polypeptides are also 
provided. This gene is believed to reside on chromosome 15. Therefore, polynucleotides 
derived from this gene are useful in linkage analysis as chromosome 15 markers. 

This gene is expressed primarily in human testes tumor and to a lesser extent in 
normal human testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the testes, particularly cancer, and other reproductive 
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the male reproductive tissues, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of testicular diseases including cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 168 

This gene is expressed primarily in fetal liver. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, conditions affecting hematopoietic development and metabolic diseases. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
hepatic system and fetal hematopoietic system, expression of this gene at significantly 
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. Preferred epitopes include 
those comprising a sequence shown in SEQ ID NO: 401 as residues: His-7 to Trp-17, 
Leu-19 to Lys-27, Pro-33 to Gly-44, Lys-68 to Gly-74, Lys-85 to Cys-95. 

The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of diseases of the developing liver and hematopoietic system, 
and act as a growth differentiation factor for hematopoietic stem cells. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 169 

The polypeptide encoded by this gene is believed to be a membrane-bound 
receptor, the extracellular domain of which is expected to consist of the following 
amino acid sequence: 

RILLVKYSANEENKYDYIJTIWVCSELVKLWC^ 

ASWKEFSDFMKWSIPAFLYFIJDNLIVFYX^ 

LKXRLNWIQWASliTIJI^IVALTAGTKTLQHNlJVGRGF^^ 

FRNECTRKDNCTAKEWTFPEAKWNTTARWSHIRLGM 

YNEKILKEGNQLTEXIHQNSKLYFFGILFNGLTLGLQRSNRDQIKNCGFFYGH 
S (SEQ ID NO:732). Thus, preferred polypeptides encoded by this gene comprise the 
extracellular domain as shown above. It will be recognized, however, that deletions of 
either end of the extracellular domain up to the first cysteine from the N-terminus and 
the first cysteine from the C-terminus are expected to retain the biological functions of the 
full-length extracellular domain because the cysteines are thought to be responsible for 
providing secondary structure to the molecule. Thus, deletions of one or more amino 
acids from either end (or both ends) of the extracellular domain are contemplated. Of 
course, further deletions including the cysteines are also contemplated as useful, as such 
polypeptides are expected to have immunological properties such as the ability to evoke 
an immune response. Polynucleotides encoding all of the foregoing polypeptides are 
provided. 
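
The terminal deletions contemplated above, in which residues are removed from either end up to the first cysteine counted from that end, can be enumerated with a short sketch. The toy sequence and function names below are hypothetical and are used only for illustration; the sketch is not applied to SEQ ID NO:732.

```python
def cysteine_core(domain):
    """Return the segment from the first cysteine (N-terminal side)
    to the last cysteine (first one counted from the C-terminus)."""
    first = domain.find("C")
    if first == -1:
        raise ValueError("domain contains no cysteine")
    last = domain.rfind("C")
    return domain[first:last + 1]


def terminal_deletions(domain):
    """List every variant obtained by deleting residues from either end
    (or both ends) without removing the outermost cysteines."""
    first = domain.find("C")
    if first == -1:
        raise ValueError("domain contains no cysteine")
    last = domain.rfind("C")
    max_c_deleted = len(domain) - 1 - last
    variants = []
    for n_deleted in range(first + 1):
        for c_deleted in range(max_c_deleted + 1):
            variants.append(domain[n_deleted:len(domain) - c_deleted])
    return variants


if __name__ == "__main__":
    toy_domain = "MKCAYCGH"  # hypothetical sequence, not from the specification
    print(cysteine_core(toy_domain))
    for variant in terminal_deletions(toy_domain):
        print(variant)
```

Each variant retains the segment bounded by the outermost cysteines; the further deletions that remove the cysteines themselves are outside the scope of this sketch.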

This gene is expressed primarily in human osteoclastoma and to a lesser extent 
in hippocampus and chondrosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancers, particularly those of the bone and connective tissues. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the skeletal system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 402 as residues: Met-1 to Cys-6, Ala-41 to Tyr-49, Lys-76 to Lys-84. 

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis of cancers of the bone and connective tissues, and may act as growth 
factors for cells involved in bone or connective tissue growth. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 170 

Preferred polypeptides encoded by this gene comprise the following amino 
acid sequence: 

NSWNLQTIJ^VLTEAIGPEPAIPRXPREPPVATSTPATPSAGPQPIJTGT^ 

LVPGGPAPPCLGEAWALlJJ>PCRPSLTSCFWSPRre (SEQ ID 

NO:733). Polynucleotides encoding such polypeptides are also provided herein. 

This gene is expressed primarily in hematopoietic progenitor cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the blood including cancer and autoimmune disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
blood/circulatory system, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 403 as residues: Gln-4 to His-10, Pro-25 
to His-32. 

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis of diseases involving growth differentiation of hematopoietic cells. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 171 
Preferred polypeptides encoded by this gene comprise the following amino acid 
sequences: ALQU^FYPDAVEEWl^ENVHPSLQRLQXLLQDLSEVS APP (SEQ ID 
NO:734); and/or CHPPALAGTLLRTPEGRAHARGLLLEAGGA (SEQ ID NO:735). 
Polynucleotides encoding such polypeptides are also provided. The protein product of 
this gene shares sequence homology with metallothioneins. Thus, polypeptides encoded 
by this gene are expected to have metallothionein activity; such activities are known in 
the art and described elsewhere herein. 

This gene is expressed primarily in kidney cortex. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the kidney including cancer and renal dysfunction. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the renal system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 404 as residues: Ser-47 to Gln-52. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of diseases of the kidney 
including kidney failure. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 172 

This gene is expressed primarily in 12-week-old early stage human. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental abnormalities. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the developing embryo, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 405 as residues: Gln-31 to Thr-43, Gly-51 to Ser-58, Pro-65 to Pro-72. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of developmental 
problems with fetal tissue. The gene may be involved in vital organ development in the 
early stage, especially hematopoiesis, cardiovascular system, and neural development. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 173 

The translation product of this gene shares sequence homology with TGN38, an 
integral membrane protein previously shown to be predominantly localized to the trans- 
Golgi network (TGN) of cells. 

This gene is expressed primarily in developing embryo and to a lesser extent in 
cancer tissues including lymphoma, endometrial, prostate and colon. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental abnormalities and cancer. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the developing fetus, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 406 as residues: His-65 to Ser-72, Pro-82 to Gly-91, Pro-98 to Glu-118, Ser-126 
to Gly-166, Pro-180 to Asp-188, Tyr-209 to Lys-214, Ghi-220 to Leu-228. 

The tissue distribution and homology to an integral membrane protein indicate 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis of cancers and developmental abnormalities where aberrant expression relates 
to an abnormality. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 174 
The translation product of this gene shares sequence homology with a dnaJ heat 
shock protein from E. coli which is allelic to sec63, a gene that affects transit of nascent 
secretory proteins across the endoplasmic reticulum in yeast. 

This gene is expressed primarily in Hodgkin's lymphoma and to a lesser extent 
in testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancer. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 407 as residues: Thr-13 to Trp-21, 
Arg-74 to Asp-81. 

The tissue distribution and homology to dnaJ indicate that polynucleotides and 
polypeptides corresponding to this gene are useful as a diagnostic for cancer including 
Hodgkin's lymphoma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 175 

This gene is expressed primarily in endothelial cells and to a lesser extent in 
bone marrow stromal cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases involving angiogenic abnormalities including diabetic 
retinopathy, macular degeneration, and other diseases including arteriosclerosis and 
cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful 
in providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
vascular system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for treating diseases where an increase or decrease in angiogenesis is indicated and as a 
factor in the wound healing process. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 176 

The translation product of this gene shares sequence homology with mat-8 
(mouse), which is thought to be important in regulating chloride conductance in cells 
(particularly in the breast) by modulating the response mediated by cAMP and protein 
kinase C to extracellular signals. 

This gene is expressed primarily in amniotic cells and hematopoietic cells 
including macrophages, neutrophils, T cells, and TNF-induced aortic endothelium, and to a 
lesser extent in testes, TNF-induced epithelial cells, and smooth muscle. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, inflammatory responses mediated by T cells, macrophages, and/or 
neutrophils, particularly those involving TNF, and also cancer. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the immune system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 409 as residues: Thr-19 to Ala-33, Leu-54 to Asp-82, Pro-89 to Ala-97, Pro-100 
to Lys-125, Ser-127 to Phe-135, Gly-164 to Leu-169, Cys-173 to Arg-178. 

The tissue distribution and homology to mat-8 indicate that polynucleotides and 
polypeptides corresponding to this gene are useful for modifying inflammatory 
responses to cytokines such as TNF and thus modifying the duration and/or severity of 
inflammation. Polynucleotides and polypeptides derived from this gene are thought to 
be useful in the diagnosis and treatment of cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 177 

This gene is expressed primarily in endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, vascular restenosis. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the vascular system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating diseases associated with vascular 
response to injury such as vascular restenosis following angioplasty. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 178 

One embodiment of the claimed invention comprises: 
MRPDWKAGAGPGGPPQKPAPSSQRKPPARPSAAAAAIAVAAAEEERRU^ 
RUU^EEDKPAVERCXEELWGDVENDEDAIiRRIJlGPRVQEH^ 
KGNFPPQKKPVWVDEEDEDEEMVDMNINNRFRKDMMK^ 
RUOEEFQHAMGGWAWAETTKRKTSSDDESEEDEDDIXQRTGNHSTS 
ILKMKNCQHANAERPTVARISICAVPSRCITO)^^ (SEQ ID NO:737); or 
CLEELWGDVENDEDAIJLRRLRGPRVQEHEDSGDSEVENEAKGI^ 
W\n3EEDEDEEMVDMMNNRFRKDMMKNASESKIJS^ 
GVPAWAETTKRKTSSDDESEEDEDDliQRTGNnSTSTSLPRGILKMKNCQHA 
NAERPTVARISICAVPSRCTDCDGC (SEQ ID NO: 738). LKEKIVRSFEVSPDGS 
FLLINGL\GYlJIIiAMKTKEUGSMKmGR 
WDVNSRKCLNRFVDEGSLYG1^IATSRNGQYVACGSNCGV\W 
Tl^KPIKAIMNLVTGVTSLTFNPTIEILAIASEI^^ 

KNKMSHVHTMDFSPRSGYFALGNEKGKAIJV^^ (SEQ ID NO:739); 
and/or KINGRVAASTFSSDSKXVYASSGDGEVYVWD\^SRK(XN 
YGl^IATSRNGQYVACGSNCGVVMYNQDSCLQETM'mKAIN^ 
FNPTTEILAIASEmKEAVRLVHLPSCTWSNFPVI^ 
YFALGNEKGKAL (SEQ ID NO:740). 

This gene is expressed primarily in epididymis and endometrial tumors and to a 
lesser extent in T cell lymphoma and cell lines derived from colon cancer. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumors of the reproductive organs including testis and endometrial cells. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
reproductive system, expression of this gene at significantly higher or lower levels may 
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 411 as residues: Ser-67 to Lys-72, Val-87 to Leu-93, 
Tyr-128 to Pro-141, Asp-204 to Gly-210. 

The tissue distribution indicates that the protein products of this gene are useful 
for treating tumors of the endometrium or epithelial tumors of the reproductive system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 179 

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequence: 

MRILQUUJ^TGLVGGETRIIKGFECKLHSQPWQAALraKTRlljCGATLIAPR 
WIiTAAH(XKPRYIVHU3QHNLQKEEGCEQTRTATESFPHPGFNNSLPNKDH 

RNDIMLVKMASPVSITWAVRPLTLSSRCVTAGTSCSFPAGAARPDPSYAaLT^ 
DAPTSPSIJSTRSWTPTPATSQTPWCWACRKGARTPARVTPGALWSVTSLFKA 
LSPGARIRVRSPESLVSTRKSANMWTGSRRR (SEQ ID NO:741); ETRnKGFEC 
miSQPWQAAIJEKTRlirGATLIAPRWLLTAAHCLKPRYIVHLGQHNLQKEE 
GCEQTRTATESFPHPGFNNSLPNKDHRNDIMLVKMASPVSrrWAVRPLTLSSR 

CVTAGTSCSFPAGAARPDPSYACLTPCDAPTSPSLSTRSVRTPTPATSQTPWCVP 
AOUCGARTPARVTPGALWSVTSUTCAl^PGARIRVRSPESLVSTRKSANMW^ 
SRRR (SEQ ID NO:742); or CKLHSQPWQAALFEKTRLLCGATLIAPRWLLT 
AAH(XKPRYIVHLGQHNLQKEEGCEQTRTATESFPHPGFNS 
(SEQ ID NO:743). The translation product of this gene shares sequence homology 
with neuropsin, a novel serine protease which is thought to be important in modulating 
extracellular signaling pathways in the brain. Owing to the structural similarity to other 
serine proteases, the protein products of this gene are expected to have serine protease 
activity, which may be assayed by methods known in the art and described elsewhere 
herein. 

This gene is expressed primarily in endometrial tumor and to a lesser extent in 
colon cancer, benign hypertrophic prostate, and thymus. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancers of the endometrium or colon and benign hypertrophy of the 
prostate. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the urogenital or reproductive systems, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 412 as residues: Gly-12 to Ser-22, 
Pro-34 to Ser-53. 

The tissue distribution and homology to serine proteases indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosing 
or treating hyperproliferative disorders such as cancer of the endometrium or colon and 
hyperplasia of the prostate. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 180 

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequence: VLQGRYFSPILEMRRLRPEGXXNLPGGSRAQKEPRQDLTLVLWPHC 

PHFA^^mSYVPTKQCMV(y^SFYCIFIFKGPVQ^ (SEQ ID NO:744). 

Polynucleotides encoding such polypeptides are also provided. 

This gene is expressed primarily in fetal brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, identifying and expanding stem cells in the CNS. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the nervous system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for detecting and expanding stem cell populations in the (or of the) central nervous 
system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 181 

This gene is expressed primarily in early stage human brain and a stromal cell 
line. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or ceU type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental abnormalities of the CNS. Similarly, polypeptides and 

25 antibodies directed to these polypeptides are useful in providing inmiunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the central nervous system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 414 as residues: Gln-42 to Gki-47, Gln-54 to Pro-60. 

The tissue distribution indicates that the protein products of this gene play a role
in the development of the central nervous system. Therefore, this gene and its products
are useful for diagnosing or treating developmental abnormalities of the central nervous
system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 182 
Preferred polypeptides encoded by this gene comprise the following amino acid
sequence:

MPfflXJWPELHDFMQSAEVGTIFALSWLnWFGH^ 
FLACHPIJ^IYFAA\aVLYREQEVIJXX>CDM^ 

TFU^SFPHPNLLGRPLPNSKLRGRQPLLSKTLSWHQPSRGLIWCCGSGXRGLL 

RPEDRTKDVLTKPRTNRFVKIJ^VMGLTVALGAAALAVVKSAI^^

FP (SEQ ID NO:745); or CPEFFIPATLPCPFVFAFTSEASSRAYLTQRGPGGLAQ 
NIJ4PLPVGFWMGSLPPPWCWRKWVSEACSCFC (SEQ ID NO:746). These
polypeptides are structurally similar to various TGF-beta family members. Thus, this 
polypeptide is expected to have a variety of activities in the modulation of cell growth
and proliferation.

This gene is expressed primarily in osteoclastoma, microvascular endothelium, 
and bone marrow derived cell lines. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematological diseases, particularly those involving aberrant proliferation of
stem cells. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 415 as residues: Ser-33 to Ala-39. 

The tissue distribution indicates that the protein products of this gene are useful
for treating disorders of the progenitors of the immune system. Applications include in
vivo expansion of progenitor cells, ex vivo expansion of progenitor cells, or the
treatment of tumors of the circulatory system, such as lymphomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 183

This gene maps to chromosome 17 and, therefore, polynucleotides of the
invention can be used in linkage analysis as a marker for chromosome 17. In specific 
embodiments, polypeptides of the invention comprise the sequence: 
GFGSVSAAGRRSGGTWQPVQ (SEQ ID NO:747); PGGLAVGSRWWSRSLT
(SEQ ID NO:748); LEPSRQRRPRRRGGTSRPETDQRAKCWRQL (SEQ ID 
NO:749); and/or VCLRCQNRMEN (SEQ ID NO:750). In further specific
embodiments, polypeptides of the invention comprise the sequence: MAACTARRPGR 
GQPLVWVADXGPVAKAALCAAXAGAFSPASTTTTRRHI^SRNRFEGKVLETV 

GVFEWKQNGKYETGQU^HSIFGYRGVVLJT>WQARLXD^

PAGHGSKEVKGKTHTYYQVUDARDCTfflSQRSQTEAVTFLi^^ 
GLDYVSHEDIU>YTSTDQWIQHELFERFIXYDQTKAPPFVARET^ 
PWLEI^DVHRETmNIRVTVIPFYMGMREAQNSHVYWW 
UlERHWRIFSI^GTLETVRGRGWGREPVl^KEQPAFQYSSHVSLQAS^^ 

GTFRFERPDGSHFDVRIPPFSLESNKDEKTPPSGLHW (SEQ ID NO:751);
MAACTARRPGRGQPLVVPVADXGPVAKAALCAA (SEQ ID NO:752); 
VLETVGX^WKQNGKyETGQLFUiSIFGYRGVV^ (SEQ ID NO:757); 
GLDYVSHEDILPYTST (SEQ ID NO:758); DVHRETTENIRVTVIPFYM (SEQ ID 
NO:759); WWRYORLENLDSDVVQLRER (SEQ ID NO:760); and/or PAFQYSS 

HVSLQASSGHMWGTFRFER (SEQ ID NO:761). Polynucleotides encoding these
polypeptides are also encompassed by the invention. 

This gene is expressed primarily in gall bladder, prostate, and fetal brain, and to 
a lesser extent in a few tumor and fetal tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, growth-related disorders such as cancers. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the prostate, gall bladder, and fetal brain,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of growth-related
disorders, such as cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 184

In specific embodiments, polypeptides of the invention comprise the 
sequence: SLCCPEGAEGC (SEQ ID NO:762) and/or QLKKTHYDRPCP (SEQ ID
NO:763). Polynucleotides encoding these polypeptides are also encompassed by the 
invention. 

This gene is expressed primarily in stromal cells, tonsil, and glioblastoma, and to
a lesser extent in some other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune and inflammatory disorders and glioblastoma. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the stromal cells, 
tonsil, and glioblastoma, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Additionally, it is believed that the
product of this gene regulates pancreatic cell differentiation into beta cells. Accordingly,
polynucleotides and polypeptides of the invention are useful in the treatment of
insulin-dependent diabetes mellitus and associated conditions (e.g., pancreatic hypofunction) and
in the prevention, as well as the treatment, of undifferentiated type pancreatic cancers.
Preferred epitopes include those comprising a sequence shown in SEQ ID NO: 417 as
residues: Pro-27 to Ala-32.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune and
inflammatory disorders and glioblastoma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 185

This gene is expressed primarily in hepatocellular carcinoma and to a lesser 
extent in other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, liver diseases. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the liver, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 418 as residues: Gly-32 to Lys-39.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of liver diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 186 

This gene is expressed primarily in hippocampus and to a lesser extent in other
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neuronal disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the hippocampus, expression of this gene at significantly 
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal disorders.



FEATURES OF PROTEIN ENCODED BY GENE NO: 187 

This gene is expressed primarily in bone cancer and hippocampus and to a 
lesser extent in osteoclastoma and other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, bone-related disorders and neuronal diseases. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the bone, osteoclast, and
hippocampus, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of bone-related
disorders and neuronal diseases. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 188 

This gene maps to chromosome 4 and, therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 4.

This gene is expressed primarily in neuronal tissues such as hippocampus,
spinal cord, and hypothalamus and to a lesser extent in a few other tissues such as
ovary.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal diseases. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the neuronal tissues, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 189

This gene maps to chromosome 10; therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 10.

This gene is expressed primarily in neuronal tissues and immune tissues, and to
a lesser extent in a few other tissues such as skin tumor and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal and immune-related disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the neuronal and immune-related tissues, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 422 as residues: Pro-19 to Asp-25.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal and 
immune-related disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 190

The translation product of this gene shares sequence homology with human 
N33, a gene located in a homozygously deleted region of human metastatic prostate 
cancer, which is thought to be important in the prevention of prostate cancer. In specific
embodiments, polypeptides of the invention comprise the sequence: 

AQRKKEMVI^EKVSQIJSIEWTNKRPVIRMNGDK^^
UJUiRQCWCKQADEEFQILANSWRYSSAFIT^nU^ 
NSAPTFINFPAKGKPKRGDTYEUJWGFSAEQIARWIADRTDVNmV^ 
ARWRFWCVSVT (SEQ ID NO:765); MWALLIVCI>VPSAS (SEQ ID NO:766); 
AQRKKEMVLSEKVSQL (SEQ ID NO:767); MEWTNKRPVIRMNGDKF (SEQ
ID NO:768); RIU.VKAPPRNYSVIVMFrAU2LHR(X^CKQAD^

SSAFTNRIFFA (SEQ ID NO:769); MVDFDEGSDWQMLNMNSAPTFINFPAK 
GKP (SEQ ID NO:770); KRGDTYELQVRGFSAEQIARWIADRTDVNIRVIRPPN
(SEQ ID NO:771); and/or YAGPU^IUjIJJJ^VIGGLVYLRRVIWNFSI^^
CVLCLL (SEQ ID NO:772). Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

This gene is expressed primarily in infant adrenal gland and a prostate cell line, and to
a lesser extent in a few other tissues such as liver and smooth muscle.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, prostate cancer and endocrine disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the prostate and adrenal gland, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 423 as residues: Pro-34 to Gly-*3, Arg-113 to Pro-120.

The tissue distribution and homology to N33 indicate that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
prostate cancer and endocrine disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 191

This gene is expressed primarily in T cells and to a lesser extent in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 424 as residues: Trp-3 to Phe-9.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 192 

This gene maps to chromosome 6; therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 6. Neural activity and
neurotrophins induce synaptic remodeling in part by altering gene expression. The
product of this gene is believed to be a glycosylphosphatidylinositol-anchored protein encoded by a
hippocampal gene and to possess neural activity. This molecule is believed to be 
expressed in postmitotic-differentiating neurons of the developing nervous system and 
neuronal structures associated with plasticity in the adult. Message of this gene is 
believed to be induced by neuronal activity and by the activity-regulated neurotrophins
BDNF and NT-3. The product of this gene is believed to stimulate neurite outgrowth
and arborization in primary embryonic hippocampal and cortical cultures and to act as a 
downstream effector of activity-induced neurite outgrowth. In specific embodiments, 
polypeptides of the invention comprise the sequence: DAVFKGFSDCLLKLGDS (SEQ 
ID NO:773); CQEGAKDMWDKLRKESKNLN (SEQ ID NO:774);
VLLVSLSAALATWLSF (SEQ ID NO:775); MGLKLNGRYISULAVQIAYLVQAVR
AAGKCDAVFKGFSDCLLKLGDS (SEQ ID NO:776); PAAWDDKTNIKTVCTYW 
EDFHSCIVrALTDCQEGAKDMWDKUlKESKN^ 

LPAFPVLLVSLSAALATWLSF (SEQ ID NO:777); and/or MGLKLNGRYISULA 
VQIAYLVQAVRAAGKCDAVFKGFSDCLUOjGDSXXXXXPAAWDDK^^ 

TYWEDFHSCTVTALTDCQEGAKDMWDKUUCESKNI^

GSIXPAFPVLLVSLSAALATWLSF (SEQ ID NO:778). Polynucleotides encoding
these polypeptides are also encompassed by the invention.

This gene is expressed primarily in human placenta, endometrial tumor and 
tissues of the central nervous system (CNS). 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, reproductive disorders, cancers and neurological diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly
reproductive and neurological disorders, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 425 as residues: Asp-47 to Asp-63,
His-75 to Tyr-80, Pro-83 to Tyr-89.
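
As a minimal sketch, residue ranges written in the notation used above (for example, "Asp-47 to Asp-63") could be parsed and used to pull the corresponding segments out of a full-length amino acid sequence. The function name, the regular expression, and the toy sequence in the usage note are assumptions made for illustration; the sequence of SEQ ID NO: 425 is not reproduced here.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Asp-47 to Asp-63".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def extract_epitopes(sequence, ranges_text):
    """Return the subsequences named by 1-based, inclusive residue ranges.

    Raises ValueError if a range falls outside the sequence or if the named
    boundary residues do not match the sequence at those positions.
    """
    epitopes = []
    for start_aa, start, end_aa, end in RANGE_RE.findall(ranges_text):
        start, end = int(start), int(end)
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        expected = (THREE_TO_ONE.get(start_aa), THREE_TO_ONE.get(end_aa))
        actual = (sequence[start - 1], sequence[end - 1])
        if expected != actual:
            raise ValueError(f"boundary residues do not match at {start}-{end}")
        epitopes.append(sequence[start - 1:end])
    return epitopes
```

With a made-up ten-residue sequence, `extract_epitopes("MKDAHYPLTY", "Asp-3 to His-5, Pro-7 to Tyr-10")` returns `["DAH", "PLTY"]`. Checking the boundary residues against the sequence is a design choice that catches numbering or transcription errors early.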

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of reproductive 
disorders such as endometrial tumors. Expression of this gene in tissues of the CNS
and its strong homology to Neuritin suggest that the protein product from this gene may 
also be used in the treatment and diagnosis of neurological disorders and in the 
regeneration of neural tissues, e.g., following injury. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 193

The translation product of this gene shares sequence homology with tenascin,
which is thought to be important in development. The translation product of this gene is
believed to be a ligand of the fibroblast growth factor family. FGF ligand activity is
known in the art and can be assayed by methods known in the art and disclosed
elsewhere herein.

This gene is expressed primarily in endometrial tumors, and other types of 
tumors. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the cancer tissues, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 426 as residues: Gly-29 to Glu-34,
Arg-71 to Arg-76, Thr-176 to Cys-182, Gly-184 to Glu-199.

The tissue distribution and homology to tenascin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of 
cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 194

In specific embodiments, polypeptides of the invention comprise the sequence: 
MNSAAGFSHLDRRERVLKLGESFEKQPRCASTLC (SEQ ID NO:779). 
Polynucleotides encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in fetal human lung and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, lung development and respiratory disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the respiratory system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution in fetal lung and neutrophils indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of lung and immunity-related diseases, for example, lung cancer, viral,
fungal or bacterial infections (e.g., lesions caused by tuberculosis), inflammation (e.g.,
pneumonia), and metabolic lesions.

FEATURES OF PROTEIN ENCODED BY GENE NO: 195 

This gene is expressed primarily in breast lymph node.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 196

This gene maps to chromosome 5 and, accordingly, polynucleotides of the invention can
be used in linkage analysis as a marker for chromosome 5. The translation product of
this gene shares sequence homology with human M-phase phosphoprotein 4, which is
thought to be important in phosphorylation and signal transduction processes. In
specific embodiments, polypeptides of the invention comprise the sequence:
TIYPTEEELQAVQKIVSrrERALKLVSD (SEQ ID NO:780); RALKGVLRV 
GVIj\KGLLLRGDRNVNLVLLC (SEQ ID NO:781); ALAALRHAKWFQARAN 
GLQSCVniRILRDLCQRVPTWS (SEQ ID NO:782); GDALRRVFECISSGHL (SEQ 
ID NO:783); LAFRQIHKVLGMDPLP (SEQ ID NO:784); and/or TIYPTEEELQAVQ 

KWSnmALKLVSDSI^EHEKNKNKEGDDKKEGGKDRALKGVLRVGVLAKG
LLLRGDRNVl^VLLCSEKPSKTLLSRIAEl^PKQLAVISPEK^^ 
NSCVEPKMQVTITLTSPIIREENMREGDVTSGMVKDPPDV^ 
HAKWFQARANGL(JSCVIIIRILRDIX:QR\a^ 

QSPGDALRRVFECISSGDLKGSPGIXDPCEKDPFimATMTDCXJI^ 

LRLLAFRQIHK\^GMDPU>QMSQRFNIHNNRKRRRD^

DYDNF (SEQ ID NO:785). Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

This gene is expressed primarily in human hippocampus and to a lesser extent
in prostate and human frontal cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders related to the reproductive system and nervous system. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the reproductive
system and nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder.

The tissue distribution and homology to human M-phase phosphoprotein 4 
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
the diagnosis and treatment of reproductive and nervous system disorders. 

10 FEATURES OF PROTEIN ENCODED BY GENE NO: 197 

In specific embodiments, polypeptides of the invention comprise the sequence: 
MGSQHSAAARPSS(3UIKQEDDRDG (SEQ ID NO:786); 
LLAEREQEEAIAQFPYVEFTGRDSrrCLTC (SEQ ID NO:787); and/or 
(JGTGYIPTEQVNELVALIPHSDQRLRPQRTKQYV (SEQ ID NO:788). 

Polynucleotides encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in human primary breast cancer and to a
lesser extent in human adult spleen, Hodgkin's lymphoma I, and salivary gland.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer and immune disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or ceU type(s). For a number of disorders of 
the above tissues or cells, particularly of the cancer and immune system, expression of 

25 this gene at significantly higher or lowar levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue <m: bodily fluid fix)m an individual not having the 

30 disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 430 as residues: Ser-126 to Gly-138. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of cancer and
immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 198

This gene is expressed primarily in monocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, blood cell disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of blood cell
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 199 

This gene is expressed primarily in Human Ovary and Synovia and to a lesser
extent in Human 8 Week Whole Embryo.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive and developmental disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the reproductive and developmental system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of reproductive
and developmental disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 200

This gene maps to chromosome 8 and therefore polynucleotides of the invention 
can be used in linkage analysis as a marker for chromosome 8. The translation product 
of this gene shares limited sequence homology with collagen proline rich domain.
This gene is expressed primarily in CNS. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological diseases. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 433 as residues:
Pro-35 to Asp-41.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 201

The translation product of this gene shares homology with a mammalian histone
H1a protein. One embodiment for this gene is the polypeptide fragments comprising the
following amino acid sequence: ARLNVGRESLKREMLKSQGVKVSESPMGAR
HSSWPEGAAFCKKVQGAQMQFPPRR (SEQ ID NO:789); ARLNVGRESLKREML
(SEQ ID NO:790); LKSQGVKVSESPMGARHSSW (SEQ ID NO:791);
AFCKKVQGAQMQFPPRR (SEQ ID NO:792). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments (See Accession No.
pir|S24178).

This gene is expressed primarily in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders.
Since the gene is expressed in cells of lymphoid origin, the natural gene product may be
involved in vital immune functions. Therefore it may be also used as an agent for
immunological disorders including arthritis, asthma, immune deficiency diseases such
as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 202 
This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders.
Since the gene is expressed in cells of lymphoid origin, the natural gene product may be
involved in immune functions. Therefore it may be also used as an agent for
immunological disorders including arthritis, asthma, immune deficiency diseases such
as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 203
This gene is expressed primarily in Neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious disorders, immune disorders, and cancers. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 436 as residues: Thr-31 to Lys-36.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of infectious
disorders, immune disorders, and cancers. Since the gene is expressed in cells of
lymphoid origin, the natural gene product may be involved in immune functions.
Therefore it may be also used as an agent for immunological disorders including
arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia. Protein, as
well as, antibodies directed against the protein may show utility as a tumor marker
and/or immunotherapy targets for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 204 

This gene maps to chromosome 16 and therefore polynucleotides of the
invention can be used in linkage analysis as markers for chromosome 16. The
translation product of this gene shares sequence homology with lactate dehydrogenase
which is thought to be important in lactate metabolism.

This gene is expressed primarily in human tonsils and to a lesser extent in
Spleen and Neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders, infectious disorders, and cancers. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune
disorders, infectious disorders, and cancers, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 437 as residues: Gly-7 to Ser-12.

The tissue distribution and homology to lactate dehydrogenase gene indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of immune disorders, infectious disorders, and cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 205

The translation product of this gene shares sequence homology with Gcapl 
protein which is developmentally regulated in brain. 

This gene is expressed primarily in placenta and endometrial tumor and to a 
lesser extent in several other tumors. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, vasculogenesis/angiogenesis and tumorigenesis. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the vascular system and tumors,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to Gcapl protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of disorder or dysfunction of the vascular system or tumorigenesis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 206

In specific embodiments, polypeptides of the invention comprise the sequence 
MPYAQWLAENDRFEEAQKAFHKAGRQREA (SEQ ID NO:799); 
VQVLEQLTNNAVAESRFNDAAYYYWMLSMQCLDIAQD (SEQ ID NO:794); 
PAQKDTMLGKFYHFQRLAELYHGYHAIHRHTEDP (SEQ ID NO:795);
FSVHRPETLFNISRFLLHSLPKDTPSGISKVKILFT (SEQ ID NO:800); 
LAKQSKALGAYRLARHAYDKLRGLYIP (SEQ ID NO:796); ARFQKSIELG 
TLTIRAKPFHDSEELVPLCYRCSTNN (SEQ ID NO: 797); and/or PLLNNLGNVC 
INCRQPHFSASSYDVLHLVEFYI£EGITDEEAISLIDLEV^ 
LPDSCG (SEQ ID NO:798). Polynucleotides encoding these polypeptides are also
encompassed by the invention. 

This gene is expressed primarily in testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive and endocrine disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the reproductive and endocrine systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment of male reproductive and endocrine
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 207

This gene is expressed in fetal lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, lung diseases such as cystic fibrosis. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the respiratory system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 440 as residues: Tyr-49 to Cys-54.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detection and treatment of disorders associated
with developing lungs particularly in premature infants where the lungs are the last
tissues to develop. The tissue distribution indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and intervention of
lung tumors since the gene may be involved in the regulation of cell division,
particularly since it is expressed in fetal tissue. Protein, as well as, antibodies directed
against the protein may show utility as a tumor marker and immunotherapy targets for
the above listed tumors and tissues.

Table 1 summarizes the information corresponding to each "Gene No." described
above. The nucleotide sequence identified as "NT SEQ ID NO:X" was assembled from
partially homologous ("overlapping") sequences obtained from the "cDNA clone ID"
identified in Table 1 and, in some cases, from additional related DNA clones. The
overlapping sequences were assembled into a single contiguous sequence of high
redundancy (usually three to five overlapping sequences at each nucleotide position),
resulting in a final sequence identified as SEQ ID NO:X.

The cDNA Clone ID was deposited on the date and given the corresponding
deposit number listed in "ATCC Deposit No:Z and Date." Some of the deposits contain
multiple different clones corresponding to the same gene. "Vector" refers to the type of
vector contained in the cDNA Clone ID.

"Total NT Seq." refers to the total number of nucleotides in the contig identified
by "Gene No." The deposited clone may contain all or most of these sequences,
reflected by the nucleotide position indicated as "5' NT of Clone Seq." and the "3' NT
of Clone Seq." of SEQ ID NO:X. The nucleotide position of SEQ ID NO:X of the
putative start codon (methionine) is identified as "5' NT of Start Codon." Similarly,
the nucleotide position of SEQ ID NO:X of the predicted signal sequence is identified as
"5' NT of First AA of Signal Pep."

The translated amino acid sequence, beginning with the methionine, is identified
as "AA SEQ ID NO:Y," although other reading frames can also be easily translated
using known molecular biology techniques. The polypeptides produced by these
alternative open reading frames are specifically contemplated by the present invention.

The first and last amino acid position of SEQ ID NO:Y of the predicted signal
peptide is identified as "First AA of Sig Pep" and "Last AA of Sig Pep." The predicted
first amino acid position of SEQ ID NO:Y of the secreted portion is identified as
"Predicted First AA of Secreted Portion." Finally, the amino acid position of SEQ ID
NO:Y of the last amino acid in the open reading frame is identified as "Last AA of
ORF."

SEQ ID NO:X and the translated SEQ ID NO:Y are sufficiently accurate and
otherwise suitable for a variety of uses well known in the art and described further
below. For instance, SEQ ID NO:X is useful for designing nucleic acid hybridization
probes that will detect nucleic acid sequences contained in SEQ ID NO:X or the cDNA
contained in the deposited clone. These probes will also hybridize to nucleic acid
molecules in biological samples, thereby enabling a variety of forensic and diagnostic
methods of the invention. Similarly, polypeptides identified from SEQ ID NO:Y may
be used to generate antibodies which bind specifically to the secreted proteins encoded
by the cDNA clones identified in Table 1.

Nevertheless, DNA sequences generated by sequencing reactions can contain
sequencing errors. The errors exist as misidentified nucleotides, or as insertions or
deletions of nucleotides in the generated DNA sequence. The erroneously inserted or
deleted nucleotides cause frame shifts in the reading frames of the predicted amino acid
sequence. In these cases, the predicted amino acid sequence diverges from the actual
amino acid sequence, even though the generated DNA sequence may be greater than
99.9% identical to the actual DNA sequence (for example, one base insertion or deletion
in an open reading frame of over 1000 bases).
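
The frame-shift effect described above can be illustrated with a short sketch. The Python below is not part of the original disclosure: the toy sequence, the helper name `translate`, and the position of the inserted base are hypothetical, chosen only to show how one inserted nucleotide changes every downstream codon even though the two DNA strings are nearly identical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical "actual" sequence and a "generated" read with one extra base
# erroneously inserted after the start codon.
actual = "ATGGCCAAGCTGACC"
generated = actual[:3] + "G" + actual[3:]

print(translate(actual))     # MAKLT
print(translate(generated))  # MGQAD
```

Only the first residue survives the insertion; every later residue of the predicted translation differs from the actual one, which is why the deposited clone is offered for applications that require precision.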

Accordingly, for those applications requiring precision in the nucleotide
sequence or the amino acid sequence, the present invention provides not only the
generated nucleotide sequence identified as SEQ ID NO:X and the predicted translated
amino acid sequence identified as SEQ ID NO:Y, but also a sample of plasmid DNA
containing a human cDNA of the invention deposited with the ATCC, as set forth in
Table 1. The nucleotide sequence of each deposited clone can readily be determined by
sequencing the deposited clone in accordance with known methods. The predicted
amino acid sequence can then be verified from such deposits. Moreover, the amino
acid sequence of the protein encoded by a particular clone can also be directly
determined by peptide sequencing or by expressing the protein in a suitable host cell
containing the deposited human cDNA, collecting the protein, and determining its
sequence.

The present invention also relates to the genes corresponding to SEQ ID NO:X,
SEQ ID NO:Y, or the deposited clone. The corresponding gene can be isolated in
accordance with known methods using the sequence information disclosed herein.
Such methods include preparing probes or primers from the disclosed sequence and
identifying or amplifying the corresponding gene from appropriate sources of genomic
material.

Also provided in the present invention are species homologs. Species
homologs may be isolated and identified by making suitable probes or primers from the
sequences provided herein and screening a suitable nucleic acid source for the desired
homolog.

The polypeptides of the invention can be prepared in any suitable manner. Such
polypeptides include isolated naturally occurring polypeptides, recombinantly produced
polypeptides, synthetically produced polypeptides, or polypeptides produced by a
combination of these methods. Means for preparing such polypeptides are well
understood in the art.

The polypeptides may be in the form of the secreted protein, including the
mature form, or may be a part of a larger protein, such as a fusion protein (see below).
It is often advantageous to include an additional amino acid sequence which contains
secretory or leader sequences, pro-sequences, sequences which aid in purification,
such as multiple histidine residues, or an additional sequence for stability during
recombinant production.

The polypeptides of the present invention are preferably provided in an isolated
form, and preferably are substantially purified. A recombinantly produced version of a
polypeptide, including the secreted polypeptide, can be substantially purified by the
one-step method described in Smith and Johnson, Gene 67:31-40 (1988).
Polypeptides of the invention also can be purified from natural or recombinant sources
using antibodies of the invention raised against the secreted protein in methods which
are well known in the art.

Signal Sequences

Methods for predicting whether a protein has a signal sequence, as well as the
cleavage point for that sequence, are available. For instance, the method of McGeoch,
Virus Res. 3:271-286 (1985), uses the information from a short N-terminal charged
region and a subsequent uncharged region of the complete (uncleaved) protein. The
method of von Heijne, Nucleic Acids Res. 14:4683-4690 (1986) uses the information
from the residues surrounding the cleavage site, typically residues -13 to +2, where +1
indicates the amino terminus of the secreted protein. The accuracy of predicting the
cleavage points of known mammalian secretory proteins for each of these methods is in
the range of 75-80%. (von Heijne, supra.) However, the two methods do not always
produce the same predicted cleavage point(s) for a given protein.

In the present case, the deduced amino acid sequence of the secreted polypeptide
was analyzed by a computer program called SignalP (Henrik Nielsen et al., Protein
Engineering 10:1-6 (1997)), which predicts the cellular location of a protein based on
the amino acid sequence. As part of this computational prediction of localization, the
methods of McGeoch and von Heijne are incorporated. The analysis of the amino acid
sequences of the secreted proteins described herein by this program provided the results
shown in Table 1.

As one of ordinary skill would appreciate, however, cleavage sites sometimes
vary from organism to organism and cannot be predicted with absolute certainty.
Accordingly, the present invention provides secreted polypeptides having a sequence
shown in SEQ ID NO:Y which have an N-terminus beginning within 5 residues (i.e., +
or - 5 residues) of the predicted cleavage point. Similarly, it is also recognized that in
some cases, cleavage of the signal sequence from a secreted protein is not entirely
uniform, resulting in more than one secreted species. These polypeptides, and the
polynucleotides encoding such polypeptides, are contemplated by the present invention.
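
As an illustration of the "+ or - 5 residues" window described above, the following Python sketch enumerates the secreted forms that would fall within that window. It is a minimal sketch only: the function name, the placeholder protein string, and the predicted position are hypothetical and are not taken from Table 1.

```python
from typing import Dict


def candidate_mature_forms(protein: str, predicted_first_aa: int,
                           window: int = 5) -> Dict[int, str]:
    """Map each allowed N-terminal position (1-based) to the secreted form
    that begins there, for positions within +/- `window` residues of the
    predicted first residue of the secreted portion."""
    forms = {}
    for start in range(predicted_first_aa - window,
                       predicted_first_aa + window + 1):
        if 1 <= start <= len(protein):  # skip positions outside the protein
            forms[start] = protein[start - 1:]
    return forms


# Hypothetical 30-residue translation with a predicted cleavage that makes
# residue 21 the first amino acid of the secreted portion.
protein = "MKTLLLAVAVLLCSSPAQAGDYKDDDDKAS"
forms = candidate_mature_forms(protein, predicted_first_aa=21)
print(sorted(forms))  # positions 16 through 26
print(forms[21])      # the form starting at the predicted position
```

For a predicted position far enough from either end of the protein, the window yields eleven candidate N-termini: the predicted one plus five on each side.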

Moreover, the signal sequence identified by the above analysis may not 
necessarily predict the naturally occurring signal sequence. For example, the naturally 
occurring signal sequence may be further upstream from the predicted signal sequence.
However, it is likely that the predicted signal sequence will be capable of directing the
secreted protein to the ER. These polypeptides, and the polynucleotides encoding such 
polypeptides, are contemplated by the present invention. 

Polynucleotide and Polypeptide Variants

"Variant" refers to a polynucleotide or polypeptide differing from the 
polynucleotide or polypeptide of the present invention, but retaining essential properties 
thereof. Generally, variants are overall closely similar, and, in many regions, identical 
to the polynucleotide or polypeptide of the present invention. 

By a polynucleotide having a nucleotide sequence at least, for example, 95%
"identical" to a reference nucleotide sequence of the present invention, it is intended that
the nucleotide sequence of the polynucleotide is identical to the reference sequence
except that the polynucleotide sequence may include up to five point mutations per each
100 nucleotides of the reference nucleotide sequence encoding the polypeptide. In other
words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to
a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence
may be deleted or substituted with another nucleotide, or a number of nucleotides up to
5% of the total nucleotides in the reference sequence may be inserted into the reference
sequence. The query sequence may be an entire sequence shown in Table 1, the ORF
(open reading frame), or any fragment specified as described herein.
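
The arithmetic behind "at least 95% identical" can be sketched as follows. This Python snippet is illustrative only; the function name is hypothetical, and rounding down for lengths that are not multiples of 100 is an assumption, since the text states the rule only per 100 nucleotides.

```python
def max_alterations(reference_length: int, percent_identity: int) -> int:
    """Largest number of point differences (substitutions, deletions or
    insertions) consistent with the stated minimum percent identity.
    Assumption: fractional results are rounded down."""
    return reference_length * (100 - percent_identity) // 100


print(max_alterations(100, 95))   # 5, i.e. five point mutations per 100 nucleotides
print(max_alterations(1000, 99))  # 10
```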

As a practical matter, whether any particular nucleic acid molecule or
polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide
sequence of the present invention can be determined conventionally using known
computer programs. A preferred method for determining the best overall match between
a query sequence (a sequence of the present invention) and a subject sequence, also
referred to as a global sequence alignment, can be determined using the FASTDB
computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990)
6:237-245). In a sequence alignment the query and subject sequences are both DNA
sequences. An RNA sequence can be compared by converting U's to T's. The result
of said global sequence alignment is in percent identity. Preferred parameters used in a
FASTDB alignment of DNA sequences to calculate percent identity are:
Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization
Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window
Size=500 or the length of the subject nucleotide sequence, whichever is shorter.
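
The two preparatory steps described above, converting an RNA sequence to DNA letters and choosing the window size, can be sketched in Python. This is a hedged illustration: the function names are hypothetical, and the dictionary simply records the parameter values listed in the text. It is not an interface to the FASTDB program itself, whose actual invocation is not described here.

```python
def rna_to_dna(sequence: str) -> str:
    """Convert U's to T's so an RNA sequence can be compared as DNA."""
    return sequence.upper().replace("U", "T")


def preferred_dna_parameters(subject_length: int) -> dict:
    """Collect the preferred DNA alignment parameters listed in the text.
    The keys are descriptive labels only, not real FASTDB option names."""
    return {
        "Matrix": "Unitary",
        "k-tuple": 4,
        "Mismatch Penalty": 1,
        "Joining Penalty": 30,
        "Randomization Group Length": 0,
        "Cutoff Score": 1,
        "Gap Penalty": 5,
        "Gap Size Penalty": 0.05,
        # 500 or the subject length, whichever is shorter
        "Window Size": min(500, subject_length),
    }


print(rna_to_dna("AUGGCUUAA"))                       # ATGGCTTAA
print(preferred_dna_parameters(120)["Window Size"])  # 120
```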

If the subject sequence is shorter than the query sequence because of 5' or 3'
deletions, not because of internal deletions, a manual correction must be made to the
results. This is because the FASTDB program does not account for 5' and 3'
truncations of the subject sequence when calculating percent identity. For subject
sequences truncated at the 5' or 3' ends, relative to the query sequence, the percent
identity is corrected by calculating the number of bases of the query sequence that are 5'
and 3' of the subject sequence, which are not matched/aligned, as a percent of the total
bases of the query sequence. Whether a nucleotide is matched/aligned is determined by
results of the FASTDB sequence alignment. This percentage is then subtracted from
the percent identity, calculated by the above FASTDB program using the specified
parameters, to arrive at a final percent identity score. This corrected score is what is
used for the purposes of the present invention. Only bases outside the 5' and 3' bases
of the subject sequence, as displayed by the FASTDB alignment, which are not
matched/aligned with the query sequence, are calculated for the purposes of manually
adjusting the percent identity score.

For example, a 90 base subject sequence is aligned to a 100 base query
sequence to determine percent identity. The deletions occur at the 5' end of the subject
sequence and therefore, the FASTDB alignment does not show a matching/alignment of
the first 10 bases at the 5' end. The 10 unpaired bases represent 10% of the sequence
(number of bases at the 5' and 3' ends not matched/total number of bases in the query
sequence) so 10% is subtracted from the percent identity score calculated by the
FASTDB program. If the remaining 90 bases were perfectly matched the final percent
identity would be 90%. In another example, a 90 base subject sequence is compared
with a 100 base query sequence. This time the deletions are internal deletions so that
there are no bases on the 5' or 3' of the subject sequence which are not matched/aligned
with the query. In this case the percent identity calculated by FASTDB is not manually
corrected. Once again, only bases 5' and 3' of the subject sequence which are not
matched/aligned with the query sequence are manually corrected for. No other manual
corrections are to be made for the purposes of the present invention.
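
The manual correction described above is simple arithmetic, and a minimal sketch of
it in Python may help. The function name and its parameters are illustrative
assumptions and are not part of the FASTDB program; the percent identity passed in is
assumed to be the value FASTDB reports with the parameters listed above.

```python
def corrected_percent_identity(fastdb_identity, query_length,
                               unmatched_5prime=0, unmatched_3prime=0):
    """Return the percent identity after the manual terminal correction.

    fastdb_identity  -- percent identity reported by the alignment program
    query_length     -- total number of bases in the query sequence
    unmatched_5prime -- query bases 5' of the subject that are not aligned
    unmatched_3prime -- query bases 3' of the subject that are not aligned
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    overhang = unmatched_5prime + unmatched_3prime
    if unmatched_5prime < 0 or unmatched_3prime < 0 or overhang > query_length:
        raise ValueError("unmatched terminal bases must fit within the query")
    # Terminal overhang expressed as a percent of the total query bases.
    penalty = 100.0 * overhang / query_length
    return fastdb_identity - penalty


# First example from the text: 10 unaligned 5' bases of a 100 base query.
# The 100.0 input assumes the program scored the 90 aligned bases as identical.
print(corrected_percent_identity(100.0, 100, unmatched_5prime=10))

# Second example: internal deletions only, so nothing is subtracted.
# The 90.0 input is an assumed program score, used only for illustration.
print(corrected_percent_identity(90.0, 100))
```

Internal gaps are deliberately absent from the arguments, since the text states that
only unaligned terminal bases enter the correction.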

By a polypeptide having an amino acid sequence at least, for example, 95%
"identical" to a query amino acid sequence of the present invention, it is intended that
the amino acid sequence of the subject polypeptide is identical to the query sequence
except that the subject polypeptide sequence may include up to five amino acid
alterations per each 100 amino acids of the query amino acid sequence. In other words,
to obtain a polypeptide having an amino acid sequence at least 95% identical to a query
amino acid sequence, up to 5% of the amino acid residues in the subject sequence may
be inserted, deleted (indels), or substituted with another amino acid. These alterations
of the reference sequence may occur at the amino or carboxy terminal positions of the
reference amino acid sequence or anywhere between those terminal positions,
interspersed either individually among residues in the reference sequence or in one or
more contiguous groups within the reference sequence.
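
As a rough illustration of the "five alterations per each 100 amino acids" rule, the
number of alterations permitted at a given identity threshold can be computed as
below. The function name is hypothetical, and rounding down for query lengths that
are not multiples of 100 is an assumption; the text does not say how fractions are
handled.

```python
def max_alterations(query_length, percent_identity):
    """Largest number of altered residues compatible with the threshold.

    Both arguments are integers. Rounding down is an assumption made here.
    """
    if query_length < 0:
        raise ValueError("query_length must not be negative")
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must lie between 0 and 100")
    # Integer arithmetic avoids floating point surprises such as 4.999...
    return query_length * (100 - percent_identity) // 100


# At least 95% identical to a 100 residue query: up to five alterations.
print(max_alterations(100, 95))
```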

As a practical matter, whether any particular polypeptide is at least 90%, 95%,
96%, 97%, 98% or 99% identical to, for instance, the amino acid sequences shown in
Table 1 or to the amino acid sequence encoded by deposited DNA clone can be
determined conventionally using known computer programs. A preferred method for
determining the best overall match between a query sequence (a sequence of the present
invention) and a subject sequence, also referred to as a global sequence alignment, can
be determined using the FASTDB computer program based on the algorithm of Brutlag
et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment the query and
subject sequences are either both nucleotide sequences or both amino acid sequences.
The result of said global sequence alignment is in percent identity. Preferred parameters
used in a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch
Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1,
Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window
Size=500 or the length of the subject amino acid sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or C-terminal
deletions, not because of internal deletions, a manual correction must be made to the
results. This is because the FASTDB program does not account for N- and C-terminal
truncations of the subject sequence when calculating global percent identity. For
subject sequences truncated at the N- and C-termini, relative to the query sequence,
the percent identity is corrected by calculating the number of residues of the query
sequence that are N- and C-terminal of the subject sequence, which are not
matched/aligned with a corresponding subject residue, as a percent of the total residues
of the query sequence. Whether a residue is matched/aligned is determined by results of
the FASTDB sequence alignment. This percentage is then subtracted from the percent
identity, calculated by the above FASTDB program using the specified parameters, to
arrive at a final percent identity score. This final percent identity score is what is used
for the purposes of the present invention. Only residues to the N- and C-termini of the
subject sequence, which are not matched/aligned with the query sequence, are
considered for the purposes of manually adjusting the percent identity score. That is,
only query residue positions outside the farthest N- and C-terminal residues of the
subject sequence are considered.

For example, a 90 amino acid residue subject sequence is aligned with a 100
residue query sequence to determine percent identity. The deletion occurs at the
N-terminus of the subject sequence and therefore, the FASTDB alignment does not show
a matching/alignment of the first 10 residues at the N-terminus. The 10 unpaired
residues represent 10% of the sequence (number of residues at the N- and C-termini
not matched/total number of residues in the query sequence) so 10% is subtracted from
the percent identity score calculated by the FASTDB program. If the remaining 90
residues were perfectly matched the final percent identity would be 90%. In another
example, a 90 residue subject sequence is compared with a 100 residue query sequence.
This time the deletions are internal deletions so there are no residues at the N- or
C-termini of the subject sequence which are not matched/aligned with the query. In this
case the percent identity calculated by FASTDB is not manually corrected. Once again,
only residue positions outside the N- and C-terminal ends of the subject sequence, as
displayed in the FASTDB alignment, which are not matched/aligned with the query
sequence are manually corrected for. No other manual corrections are to be made for the
purposes of the present invention.
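
The statement that only query positions outside the farthest N- and C-terminal
residues of the subject are counted can be sketched as follows. This is an
illustrative Python fragment under stated assumptions: the function names are
hypothetical, and the inputs are taken to be the 1-based query positions of the
first and last residues that the alignment pairs with a subject residue.

```python
def terminal_overhang(query_length, first_aligned, last_aligned):
    """Count query residues lying outside the aligned span of the subject.

    first_aligned and last_aligned are 1-based query positions (assumed to be
    read from the displayed alignment). Returns (n_terminal, c_terminal).
    """
    if not 1 <= first_aligned <= last_aligned <= query_length:
        raise ValueError("aligned span must lie within the query")
    return first_aligned - 1, query_length - last_aligned


def overhang_percent(query_length, first_aligned, last_aligned):
    """Percentage to subtract from the reported percent identity."""
    n_term, c_term = terminal_overhang(query_length, first_aligned, last_aligned)
    return 100.0 * (n_term + c_term) / query_length


# N-terminal truncation: query residues 1-10 have no subject partner.
print(terminal_overhang(100, 11, 100), overhang_percent(100, 11, 100))

# Internal deletions only: the aligned span still covers the whole query.
print(terminal_overhang(100, 1, 100), overhang_percent(100, 1, 100))
```

Gaps between the first and last aligned positions never affect the result, which
matches the rule that internal deletions are not manually corrected.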

The variants may contain alterations in the coding regions, non-coding regions,
or both. Especially preferred are polynucleotide variants containing alterations which
produce silent substitutions, additions, or deletions, but do not alter the properties or
activities of the encoded polypeptide. Nucleotide variants produced by silent
substitutions due to the degeneracy of the genetic code are preferred. Moreover,
variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any
combination are also preferred. Polynucleotide variants can be produced for a variety
of reasons, e.g., to optimize codon expression for a particular host (change codons in
the human mRNA to those preferred by a bacterial host such as E. coli).

Naturally occurring variants are called "allelic variants," and refer to one of
several alternate forms of a gene occupying a given locus on a chromosome of an
organism. (Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985).) These
allelic variants can vary at either the polynucleotide and/or polypeptide level.
Alternatively, non-naturally occurring variants may be produced by mutagenesis
techniques or by direct synthesis.

Using known methods of protein engineering and recombinant DNA
technology, variants may be generated to improve or alter the characteristics of the
polypeptides of the present invention. For instance, one or more amino acids can be
deleted from the N-terminus or C-terminus of the secreted protein without substantial
loss of biological function. The authors of Ron et al., J. Biol. Chem. 268: 2984-2988
(1993), reported variant KGF proteins having heparin binding activity even after
deleting 3, 8, or 27 amino-terminal amino acid residues. Similarly, Interferon gamma
exhibited up to ten times higher activity after deleting 8-10 amino acid residues from the
carboxy terminus of this protein. (Dobeli et al., J. Biotechnology 7:199-216 (1988).)

Moreover, ample evidence demonstrates that variants often retain a biological
activity similar to that of the naturally occurring protein. For example, Gayle and
coworkers (J. Biol. Chem 268:22105-22111 (1993)) conducted extensive mutational
analysis of human cytokine IL-1a. They used random mutagenesis to generate over
3,500 individual IL-1a mutants that averaged 2.5 amino acid changes per variant over
the entire length of the molecule. Multiple mutations were examined at every possible
amino acid position. The investigators found that "[m]ost of the molecule could be
altered with little effect on either [binding or biological activity]." (See, Abstract.) In
fact, only 23 unique amino acid sequences, out of more than 3,500 nucleotide
sequences examined, produced a protein that significantly differed in activity from
wild-type.

Furthermore, even if deleting one or more amino acids from the N-terminus or
C-terminus of a polypeptide results in modification or loss of one or more biological
functions, other biological activities may still be retained. For example, the ability of a
deletion variant to induce and/or to bind antibodies which recognize the secreted form
will likely be retained when less than the majority of the residues of the secreted form
are removed from the N-terminus or C-terminus. Whether a particular polypeptide
lacking N- or C-terminal residues of a protein retains such immunogenic activities can
readily be determined by routine methods described herein and otherwise known in the
art.

Thus, the invention further includes polypeptide variants which show
substantial biological activity. Such variants include deletions, insertions, inversions,
repeats, and substitutions selected according to general rules known in the art so as
to have little effect on activity. For example, guidance concerning how to make
phenotypically silent amino acid substitutions is provided in Bowie, J. U. et al.,
Science 247:1306-1310 (1990), wherein the authors indicate that there are two main
strategies for studying the tolerance of an amino acid sequence to change.

The first strategy exploits the tolerance of amino acid substitutions by natural
selection during the process of evolution. By comparing amino acid sequences in
different species, conserved amino acids can be identified. These conserved amino
acids are likely important for protein function. In contrast, the amino acid positions
where substitutions have been tolerated by natural selection indicate that these
positions are not critical for protein function. Thus, positions tolerating amino acid
substitution could be modified while still maintaining biological activity of the protein.

The second strategy uses genetic engineering to introduce amino acid changes at
specific positions of a cloned gene to identify regions critical for protein function. For
example, site directed mutagenesis or alanine-scanning mutagenesis (introduction of
single alanine mutations at every residue in the molecule) can be used. (Cunningham
and Wells, Science 244:1081-1085 (1989).) The resulting mutant molecules can then
be tested for biological activity.

As the authors state, these two strategies have revealed that proteins are
surprisingly tolerant of amino acid substitutions. The authors further indicate which
amino acid changes are likely to be permissive at certain amino acid positions in the
protein. For example, most buried (within the tertiary structure of the protein) amino
acid residues require nonpolar side chains, whereas few features of surface side chains
are generally conserved. Moreover, tolerated conservative amino acid substitutions
involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile;
replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues
Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic
residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp;
and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly.
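
The conservative substitution groups listed above lend themselves to a simple lookup.
The following Python sketch encodes exactly those groups; the group labels, the
dictionary name and the function name are illustrative choices, and treating two
residues as interchangeable whenever they share any one group is an assumption about
how the list is to be read.

```python
# Residue groups as recited in the text, using three-letter codes.
CONSERVATIVE_GROUPS = {
    "aliphatic or hydrophobic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg", "His"},
    "aromatic": {"Phe", "Tyr", "Trp"},
    "small": {"Ala", "Ser", "Thr", "Met", "Gly"},
}


def is_conservative(original, replacement):
    """True if two different residues share at least one group above."""
    if original == replacement:
        return False  # no substitution has taken place
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS.values())


print(is_conservative("Leu", "Ile"))  # both aliphatic or hydrophobic
print(is_conservative("Lys", "Asp"))  # basic versus acidic
```

Note that a residue such as Ala belongs to two groups, so it can be exchanged
conservatively with members of either.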

Besides conservative amino acid substitution, variants of the present invention
include (i) substitutions with one or more of the non-conserved amino acid residues,
where the substituted amino acid residues may or may not be one encoded by the
genetic code, or (ii) substitution with one or more of amino acid residues having a
substituent group, or (iii) fusion of the mature polypeptide with another compound,
such as a compound to increase the stability and/or solubility of the polypeptide (for
example, polyethylene glycol), or (iv) fusion of the polypeptide with additional amino
acids, such as an IgG Fc fusion region peptide, or leader or secretory sequence, or a
sequence facilitating purification. Such variant polypeptides are deemed to be within
the scope of those skilled in the art from the teachings herein.

For example, polypeptide variants containing amino acid substitutions of
charged amino acids with other charged or neutral amino acids may produce proteins
with improved characteristics, such as less aggregation. Aggregation of pharmaceutical
formulations both reduces activity and increases clearance due to the aggregate's
immunogenic activity. (Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967);
Robbins et al., Diabetes 36: 838-845 (1987); Cleland et al., Crit. Rev. Therapeutic
Drug Carrier Systems 10:307-377 (1993).)

Polynucleotide and Polypeptide Fragments

In the present invention, a "polynucleotide fragment" refers to a short
polynucleotide having a nucleic acid sequence contained in the deposited clone or
shown in SEQ ID NO:X. The short nucleotide fragments are preferably at least about
15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt,
and even more preferably, at least about 40 nt in length. A fragment "at least 20 nt in
length," for example, is intended to include 20 or more contiguous bases from the
cDNA sequence contained in the deposited clone or the nucleotide sequence shown in
SEQ ID NO:X. These nucleotide fragments are useful as diagnostic probes and primers
as discussed herein. Of course, larger fragments (e.g., 50, 150, 500, 600, 2000
nucleotides) are preferred.

Moreover, representative examples of polynucleotide fragments of the
invention, include, for example, fragments having a sequence from about nucleotide
number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400,
401-450, 451-500, 501-550, 551-600, 651-700, 701-750, 751-800, 800-850, 851-900,
901-950, 951-1000, 1001-1050, 1051-1100, 1101-1150, 1151-1200, 1201-1250,
1251-1300, 1301-1350, 1351-1400, 1401-1450, 1451-1500, 1501-1550, 1551-1600,
1601-1650, 1651-1700, 1701-1750, 1751-1800, 1801-1850, 1851-1900, 1901-1950,
1951-2000, or 2001 to the end of SEQ ID NO:X or the cDNA contained in the
deposited clone. In this context "about" includes the particularly recited ranges, larger
or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
Preferably, these fragments encode a polypeptide which has biological activity. More
preferably, these polynucleotides can be used as probes or primers as discussed herein.

In the present invention, a "polypeptide fragment" refers to a short amino acid
sequence contained in SEQ ID NO:Y or encoded by the cDNA contained in the
deposited clone. Protein fragments may be "free-standing," or comprised within a
larger polypeptide of which the fragment forms a part or region, most preferably as a
single continuous region. Representative examples of polypeptide fragments of the
invention, include, for example, fragments from about amino acid number 1-20, 21-40,
41-60, 61-80, 81-100, 102-120, 121-140, 141-160, or 161 to the end of the coding
region. Moreover, polypeptide fragments can be about 20, 30, 40, 50, 60, 70, 80, 90,
100, 110, 120, 130, 140, or 150 amino acids in length. In this context "about"
includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1)
amino acids, at either extreme or at both extremes.
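
The representative fragment coordinates recited above follow a window pattern
(50 nucleotides for polynucleotides, 20 residues for polypeptides). A minimal Python
sketch of such a tiling is given below; the function name is hypothetical, and a
strictly regular tiling is an assumption, so its output is not a verbatim
reproduction of the recited list.

```python
def fragment_ranges(length, window):
    """Tile a sequence of the given length into consecutive windows.

    Returns 1-based inclusive (start, end) pairs; the last pair runs to the
    end of the sequence and may be shorter than the window.
    """
    if length <= 0 or window <= 0:
        raise ValueError("length and window must be positive")
    return [(start, min(start + window - 1, length))
            for start in range(1, length + 1, window)]


# Polynucleotide-style windows of 50 over a 120 nt sequence.
print(fragment_ranges(120, 50))

# Polypeptide-style windows of 20 over a 45 residue sequence.
print(fragment_ranges(45, 20))
```

The "about" tolerance of up to 5 positions at either end is not modelled here; it
could be layered on by shifting each start or end by the chosen offset.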

Preferred polypeptide fragments include the secreted protein as well as the
mature form. Further preferred polypeptide fragments include the secreted protein or
the mature form having a continuous series of deleted residues from the amino or the
carboxy terminus, or both. For example, any number of amino acids, ranging from
1-60, can be deleted from the amino terminus of either the secreted polypeptide or the
mature form. Similarly, any number of amino acids, ranging from 1-30, can be deleted
from the carboxy terminus of the secreted protein or mature form. Furthermore, any
combination of the above amino and carboxy terminus deletions are preferred.
Similarly, polynucleotide fragments encoding these polypeptide fragments are also
preferred.

Particularly, N-terminal deletions of the polypeptide of the present invention can
be described by the general formula m-p, where p is the total number of amino acids in
the polypeptide and m is an integer from 2 to (p-1), and where both of these integers (m
& p) correspond to the position of the amino acid residue identified in SEQ ID NO:Y.

Moreover, C-terminal deletions of the polypeptide of the present invention can
also be described by the general formula 1-n, where n is an integer from 2 to (p-1), and
again where these integers (n & p) correspond to the position of the amino acid residue
identified in SEQ ID NO:Y.

The invention also provides polypeptides having one or more amino acids
deleted from both the amino and the carboxyl termini, which may be described
generally as having residues m-n of SEQ ID NO:Y, where m and n are integers as
described above.
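
The general formulas m-p, 1-n and m-n can be enumerated mechanically. The Python
sketch below is illustrative only: the function names and the toy sequence are
hypothetical, and requiring m to be no greater than n for the double deletions is an
assumption made so that every fragment is non-empty.

```python
def n_terminal_deletions(p):
    """Residue ranges m-p, with m running from 2 to p-1."""
    return [(m, p) for m in range(2, p)]


def c_terminal_deletions(p):
    """Residue ranges 1-n, with n running from 2 to p-1."""
    return [(1, n) for n in range(2, p)]


def double_deletions(p):
    """Residue ranges m-n with m and n each from 2 to p-1 (m <= n assumed)."""
    return [(m, n) for m in range(2, p) for n in range(m, p)]


def fragment(sequence, first, last):
    """Extract residues first..last (1-based, inclusive) from a sequence."""
    return sequence[first - 1:last]


toy = "MKTAY"  # hypothetical five residue sequence, so p = 5
print(n_terminal_deletions(len(toy)))
print(c_terminal_deletions(len(toy)))
print(fragment(toy, 2, 5))
```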

Also preferred are polypeptide and polynucleotide fragments characterized by
structural or functional domains, such as fragments that comprise alpha-helix and
alpha-helix forming regions, beta-sheet and beta-sheet-forming regions, turn and
turn-forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic
regions, alpha amphipathic regions, beta amphipathic regions, flexible regions,
surface-forming regions, substrate binding region, and high antigenic index regions.
Polypeptide fragments of SEQ ID NO:Y falling within conserved domains are
specifically contemplated by the present invention. Moreover, polynucleotide
fragments encoding these domains are also contemplated.

Other preferred fragments are biologically active fragments. Biologically active
fragments are those exhibiting activity similar, but not necessarily identical, to an
activity of the polypeptide of the present invention. The biological activity of the
fragments may include an improved desired activity, or a decreased undesirable activity.

Epitopes & Antibodies 

In the present invention, "epitopes" refer to polypeptide fragments having
antigenic or immunogenic activity in an animal, especially in a human. A preferred
embodiment of the present invention relates to a polypeptide fragment comprising an
epitope, as well as the polynucleotide encoding this fragment. A region of a protein
molecule to which an antibody can bind is defined as an "antigenic epitope." In
contrast, an "immunogenic epitope" is defined as a part of a protein that elicits an
antibody response. (See, for instance, Geysen et al., Proc. Natl. Acad. Sci. USA
81:3998-4002 (1983).)

Fragments which function as epitopes may be produced by any conventional
means. (See, e.g., Houghten, R. A., Proc. Natl. Acad. Sci. USA 82:5131-5135
(1985) further described in U.S. Patent No. 4,631,211.)

In the present invention, antigenic epitopes preferably contain a sequence of at
least seven, more preferably at least nine, and most preferably between about 15 to
about 30 amino acids. Antigenic epitopes are useful to raise antibodies, including
monoclonal antibodies, that specifically bind the epitope. (See, for instance, Wilson et
al., Cell 37:767-778 (1984); Sutcliffe, J. G. et al., Science 219:660-666 (1983).)

Similarly, immunogenic epitopes can be used to induce antibodies according to
methods well known in the art. (See, for instance, Sutcliffe et al., supra; Wilson et al.,
supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et
al., J. Gen. Virol. 66:2347-2354 (1985).) A preferred immunogenic epitope includes
the secreted protein. The immunogenic epitopes may be presented together with a
carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if
it is long enough (at least about 25 amino acids), without a carrier. However,
immunogenic epitopes comprising as few as 8 to 10 amino acids have been shown to be
sufficient to raise antibodies capable of binding to, at the very least, linear epitopes in a
denatured polypeptide (e.g., in Western blotting).

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is
meant to include intact molecules as well as antibody fragments (such as, for example,
Fab and F(ab')2 fragments) which are capable of specifically binding to protein. Fab
and F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from
the circulation, and may have less non-specific tissue binding than an intact antibody.
(Wahl et al., J. Nucl. Med. 24:316-325 (1983).) Thus, these fragments are preferred,
as well as the products of a FAB or other immunoglobulin expression library.
Moreover, antibodies of the present invention include chimeric, single chain, and
humanized antibodies.

Fusion Proteins 

Any polypeptide of the present invention can be used to generate fusion
proteins. For example, the polypeptide of the present invention, when fused to a
second protein, can be used as an antigenic tag. Antibodies raised against the
polypeptide of the present invention can be used to indirectly detect the second protein
by binding to the polypeptide. Moreover, because secreted proteins target cellular
locations based on trafficking signals, the polypeptides of the present invention can be
used as targeting molecules once fused to other proteins.

Examples of domains that can be fused to polypeptides of the present invention
include not only heterologous signal sequences, but also other heterologous functional
regions. The fusion does not necessarily need to be direct, but may occur through
linker sequences.

Moreover, fusion proteins may also be engineered to improve characteristics of
the polypeptide of the present invention. For instance, a region of additional amino
acids, particularly charged amino acids, may be added to the N-terminus of the
polypeptide to improve stability and persistence during purification from the host cell or
subsequent handling and storage. Also, peptide moieties may be added to the
polypeptide to facilitate purification. Such regions may be removed prior to final
preparation of the polypeptide. The addition of peptide moieties to facilitate handling of
polypeptides is a familiar and routine technique in the art.

Moreover, polypeptides of the present invention, including fragments, and
specifically epitopes, can be combined with parts of the constant domain of
immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins
facilitate purification and show an increased half-life in vivo. One reported example
describes chimeric proteins consisting of the first two domains of the human
CD4-polypeptide and various domains of the constant regions of the heavy or light
chains of mammalian immunoglobulins. (EP A 394,827; Traunecker et al., Nature
331:84-86 (1988).) Fusion proteins having disulfide-linked dimeric structures (due to
the IgG) can also be more efficient in binding and neutralizing other molecules, than the
monomeric secreted protein or protein fragment alone. (Fountoulakis et al., J.
Biochem. 270:3958-3964 (1995).)

Similarly, EP-A-0 464 533 (Canadian counterpart 2045869) discloses fusion
proteins comprising various portions of constant region of immunoglobulin molecules
together with another human protein or part thereof. In many cases, the Fc part in a
fusion protein is beneficial in therapy and diagnosis, and thus can result in, for
example, improved pharmacokinetic properties. (EP-A 0232 262.) Alternatively,
deleting the Fc part after the fusion protein has been expressed, detected, and purified,
would be desired. For example, the Fc portion may hinder therapy and diagnosis if the
fusion protein is used as an antigen for immunizations. In drug discovery, for
example, human proteins, such as hIL-5, have been fused with Fc portions for the
purpose of high-throughput screening assays to identify antagonists of hIL-5. (See, D.
Bennett et al., J. Molecular Recognition 8:52-58 (1995); K. Johanson et al., J. Biol.
Chem. 270:9459-9471 (1995).)

Moreover, the polypeptides of the present invention can be fused to marker
sequences, such as a peptide which facilitates purification of the fused polypeptide. In
preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide,
such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue,
Chatsworth, CA, 91311), among others, many of which are commercially available.
As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for
instance, hexa-histidine provides for convenient purification of the fusion protein.
Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope
derived from the influenza hemagglutinin protein. (Wilson et al., Cell 37:767 (1984).)

Thus, any of these above fusions can be engineered using the polynucleotides 
or the polypeptides of the present invention. 

Vectors, Host Cells, and Protein Production

The present invention also relates to vectors containing the polynucleotide of the
present invention, host cells, and the production of polypeptides by recombinant
techniques. The vector may be, for example, a phage, plasmid, viral, or retroviral
vector. Retroviral vectors may be replication competent or replication defective. In the
latter case, viral propagation generally will occur only in complementing host cells.

The polynucleotides may be joined to a vector containing a selectable marker for
propagation in a host. Generally, a plasmid vector is introduced in a precipitate, such
as a calcium phosphate precipitate, or in a complex with a charged lipid. If the vector is
a virus, it may be packaged in vitro using an appropriate packaging cell line and then
transduced into host cells.

The polynucleotide insert should be operatively linked to an appropriate
promoter, such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac
promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to
name a few. Other suitable promoters will be known to the skilled artisan. The
expression constructs will further contain sites for transcription initiation, termination,
and, in the transcribed region, a ribosome binding site for translation. The coding
portion of the transcripts expressed by the constructs will preferably include a
translation initiating codon at the beginning and a termination codon (UAA, UGA or
UAG) appropriately positioned at the end of the polypeptide to be translated.
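
As a hedged illustration of the last requirement, the Python sketch below checks that
a coding sequence begins with an initiating codon and ends with one of the three
termination codons named above, with no earlier in-frame stop. The function name is
hypothetical, and taking AUG as the initiating codon is an assumption, since the
text does not name it.

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}  # termination codons named in the text
START_CODON = "AUG"                  # assumed initiating codon


def is_complete_coding_region(sequence):
    """Check for a start codon, a terminal stop codon and an open frame.

    Accepts RNA or DNA letters; T is converted to U before checking.
    """
    rna = sequence.upper().replace("T", "U")
    if len(rna) < 6 or len(rna) % 3 != 0:
        return False
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    if codons[0] != START_CODON or codons[-1] not in STOP_CODONS:
        return False
    # A stop codon before the last position would truncate the product.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


print(is_complete_coding_region("AUGGCCUAA"))     # start, one codon, stop
print(is_complete_coding_region("AUGUAAGCCUAG"))  # premature in-frame stop
```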

As indicated, the expression vectors will preferably include at least one
selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin
resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin resistance
genes for culturing in E. coli and other bacteria. Representative examples of
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli,
Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells; insect
cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS,
293, and Bowes melanoma cells; and plant cells. Appropriate culture mediums and
conditions for the above-described host cells are known in the art.

Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9,
available from QIAGEN, Inc.; pBluescript vectors, Phagescript vectors, pNH8A,
pNH16a, pNH18A, pNH46A, available from Stratagene Cloning Systems, Inc.; and
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia Biotech,
Inc. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1
and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available
from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan.

Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated
transfection, electroporation, transduction, infection, or other methods. Such methods
are described in many standard laboratory manuals, such as Davis et al., Basic Methods
In Molecular Biology (1986). It is specifically contemplated that the polypeptides of the
present invention may in fact be expressed by a host cell lacking a recombinant vector.

A polypeptide of this invention can be recovered and purified from recombinant
cell cultures by well-known methods including ammonium sulfate or ethanol
precipitation, acid extraction, anion or cation exchange chromatography,
phosphocellulose chromatography, hydrophobic interaction chromatography, affinity
chromatography, hydroxylapatite chromatography and lectin chromatography. Most
preferably, high performance liquid chromatography ("HPLC") is employed for
purification.

Polypeptides of the present invention, and preferably the secreted form, can also
be recovered from: products purified from natural sources, including bodily fluids,
tissues and cells, whether directly isolated or cultured; products of chemical synthetic
procedures; and products produced by recombinant techniques from a prokaryotic or
eukaryotic host, including, for example, bacterial, yeast, higher plant, insect, and
mammalian cells. Depending upon the host employed in a recombinant production
procedure, the polypeptides of the present invention may be glycosylated or may be
non-glycosylated. In addition, polypeptides of the invention may also include an initial
modified methionine residue, in some cases as a result of host-mediated processes.
Thus, it is well known in the art that the N-terminal methionine encoded by the
translation initiation codon generally is removed with high efficiency from any protein
after translation in all eukaryotic cells. While the N-terminal methionine on most
proteins also is efficiently removed in most prokaryotes, for some proteins, this
prokaryotic removal process is inefficient, depending on the nature of the amino acid to
which the N-terminal methionine is covalently linked.

Uses of the Polynucleotides 

Each of the polynucleotides identified herein can be used in numerous ways as 
reagents. The following description should be considered exemplary and utilizes 
known techniques. 

The polynucleotides of the present invention are useful for chromosome 
identification. There exists an ongoing need to identify new chromosome markers, 
since few chromosome marking reagents, based on actual sequence data (repeat 
polymorphisms), are presently available. Each polynucleotide of the present invention 
can be used as a chromosome marker. 

Briefly, sequences can be mapped to chromosomes by preparing PCR primers 
(preferably 15-25 bp) from the sequences shown in SEQ ID NO:X. Primers can be 
selected using computer analysis so that primers do not span more than one predicted 
exon in the genomic DNA. These primers are then used for PCR screening of somatic 
cell hybrids containing individual human chromosomes. Only those hybrids containing
the human gene corresponding to the SEQ ID NO:X will yield an amplified fragment.
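
As an illustration only, the primer-selection constraint described above (primers of 15-25 bp that do not span more than one predicted exon) can be sketched in Python. The function name, the half-open exon coordinates, and the toy sequence are assumptions made for this sketch and are not part of the disclosure; practical primer design would also weigh melting temperature, GC content, and specificity.

```python
def candidate_primers(sequence, exons, min_len=15, max_len=25):
    """List (start, end, primer) windows that lie entirely inside one exon.

    `exons` is a list of half-open (start, end) coordinates on `sequence`.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    primers = []
    for exon_start, exon_end in exons:
        for length in range(min_len, max_len + 1):
            # The last valid start keeps the whole primer inside this exon.
            for start in range(exon_start, exon_end - length + 1):
                end = start + length
                primers.append((start, end, sequence[start:end]))
    return primers


# Toy 60-base sequence with two hypothetical predicted exons.
toy_sequence = "ACGT" * 15
toy_exons = [(0, 20), (30, 60)]
primers = candidate_primers(toy_sequence, toy_exons)
```

Each returned window falls inside a single exon by construction, so no candidate straddles an exon boundary.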

Similarly, somatic hybrids provide a rapid method of PCR mapping the
polynucleotides to particular chromosomes. Three or more clones can be assigned per
day using a single thermal cycler. Moreover, sublocalization of the polynucleotides can
be achieved with panels of specific chromosome fragments. Other gene mapping
strategies that can be used include in situ hybridization, prescreening with labeled
flow-sorted chromosomes, and preselection by hybridization to construct
chromosome specific-cDNA libraries.

Precise chromosomal location of the polynucleotides can also be achieved using 
fluorescence in situ hybridization (FISH) of a metaphase chromosomal spread. This
technique uses polynucleotides as short as 500 or 600 bases; however, polynucleotides 
2,000-4,000 bp are preferred. For a review of this technique, see Verma et al., 
"Human Chromosomes: a Manual of Basic Techniques," Pergamon Press, New York 
(1988). 

For chromosome mapping, the polynucleotides can be used individually (to
mark a single chromosome or a single site on that chromosome) or in panels (for
marking multiple sites and/or multiple chromosomes). Preferred polynucleotides
correspond to the noncoding regions of the cDNAs because the coding sequences are
more likely conserved within gene families, thus increasing the chance of cross
hybridization during chromosomal mapping.

Once a polynucleotide has been mapped to a precise chromosomal location, the
physical position of the polynucleotide can be used in linkage analysis. Linkage
analysis establishes coinheritance between a chromosomal location and presentation of a
particular disease. (Disease mapping data are found, for example, in V. McKusick,
Mendelian Inheritance in Man (available on line through Johns Hopkins University
Welch Medical Library).) Assuming 1 megabase mapping resolution and one gene per
20 kb, a cDNA precisely localized to a chromosomal region associated with the disease
could be one of 50-500 potential causative genes.
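
The lower figure in that estimate follows from simple division, sketched here in Python as a worked check. The function name is hypothetical. Reading the upper figure as a coarser mapping resolution of about 10 megabases at the same gene density is an assumption of this sketch, not a statement from the text.

```python
def candidate_gene_count(region_bp, bp_per_gene=20_000):
    """Estimate how many genes fit in a mapped region at a uniform density."""
    return region_bp // bp_per_gene


# 1 megabase resolution at one gene per 20 kb.
genes_at_1_mb = candidate_gene_count(1_000_000)

# Assumed coarser resolution that would yield the upper end of the range.
genes_at_10_mb = candidate_gene_count(10_000_000)
```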

Thus, once coinheritance is established, differences in the polynucleotide and
the corresponding gene between affected and unaffected individuals can be examined.
First, visible structural alterations in the chromosomes, such as deletions or
translocations, are examined in chromosome spreads or by PCR. If no structural
alterations exist, the presence of point mutations is ascertained. Mutations observed in
some or all affected individuals, but not in normal individuals, indicate that the
mutation may cause the disease. However, complete sequencing of the polypeptide and
the corresponding gene from several normal individuals is required to distinguish the
mutation from a polymorphism. If a new polymorphism is identified, this polymorphic
polypeptide can be used for further linkage analysis.

Furthermore, increased or decreased expression of the gene in affected
individuals as compared to unaffected individuals can be assessed using
polynucleotides of the present invention. Any of these alterations (altered expression,
chromosomal rearrangement, or mutation) can be used as a diagnostic or prognostic
marker.

In addition to the foregoing, a polynucleotide can be used to control gene
expression through triple helix formation or antisense DNA or RNA. Both methods
rely on binding of the polynucleotide to DNA or RNA. For these techniques, preferred
polynucleotides are usually 20 to 40 bases in length and complementary to either the
region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids
Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science
251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxy-nucleotides as Antisense Inhibitors of Gene Expression, CRC
Press, Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off
of RNA transcription from DNA, while antisense RNA hybridization blocks translation
of an mRNA molecule into polypeptide. Both techniques are effective in model
systems, and the information disclosed herein can be used to design antisense or triple
helix polynucleotides in an effort to treat disease.
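
To make the antisense case concrete, the following Python sketch derives a DNA oligonucleotide complementary to a chosen window of an mRNA, limited to the 20 to 40 base length mentioned above. The function name and the toy mRNA are hypothetical. The sketch covers only Watson-Crick complementarity and says nothing about target accessibility, stability, or efficacy.

```python
# Maps each RNA base to its complementary DNA base.
RNA_TO_COMPLEMENTARY_DNA = str.maketrans("ACGU", "TGCA")


def antisense_dna(mrna, start, length=20):
    """Return the antisense DNA oligo (written 5' to 3') for an mRNA window."""
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    target = mrna[start:start + length]
    if len(target) != length:
        raise ValueError("window extends past the end of the mRNA")
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return target.translate(RNA_TO_COMPLEMENTARY_DNA)[::-1]


# Hypothetical 33-base mRNA: a start codon followed by ten GCU codons.
toy_mrna = "AUG" + "GCU" * 10
oligo = antisense_dna(toy_mrna, 0, 20)
```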

Polynucleotides of the present invention are also useful in gene therapy. One 
goal of gene therapy is to insert a normal gene into an organism having a defective 
gene, in an effort to correct the genetic defect. The polynucleotides disclosed in the
present invention offer a means of targeting such genetic defects in a highly accurate
manner. Another goal is to insert a new gene that was not present in the host genome,
thereby producing a new trait in the host cell. 

The polynucleotides are also useful for identifying individuals from minute
biological samples. The United States military, for example, is considering the use of
restriction fragment length polymorphism (RFLP) for identification of its personnel. In
this technique, an individual's genomic DNA is digested with one or more restriction
enzymes, and probed on a Southern blot to yield unique bands for identifying
personnel. This method does not suffer from the current limitations of "Dog Tags"
which can be lost, switched, or stolen, making positive identification difficult. The
polynucleotides of the present invention can be used as additional DNA markers for
RFLP.
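
The principle behind RFLP can be illustrated with a small in silico digest, sketched in Python below. The example uses the EcoRI recognition site GAATTC, which is cut after the first G. The function name and both toy sequences are hypothetical and serve only to show how a single base change at a restriction site alters the set of fragment lengths that would appear as bands.

```python
def digest_fragment_lengths(dna, site="GAATTC", cut_offset=1):
    """Return fragment lengths (longest first) after cutting at every site."""
    lengths = []
    previous_cut = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        lengths.append(cut - previous_cut)
        previous_cut = cut
        position = dna.find(site, position + 1)
    # The remainder after the last cut is the final fragment.
    lengths.append(len(dna) - previous_cut)
    return sorted(lengths, reverse=True)


# Individual A carries two EcoRI sites; individual B has a point mutation
# (GAATTC -> GAGTTC) that destroys the second site.
individual_a = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
individual_b = "A" * 10 + "GAATTC" + "C" * 20 + "GAGTTC" + "T" * 5

bands_a = digest_fragment_lengths(individual_a)
bands_b = digest_fragment_lengths(individual_b)
```

The two individuals yield different band patterns from sequences that differ by one base, which is the property the identification method relies on.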

The polynucleotides of the present invention can also be used as an alternative to
RFLP, by determining the actual base-by-base DNA sequence of selected portions of an
individual's genome. These sequences can be used to prepare PCR primers for
amplifying and isolating such selected DNA, which can then be sequenced. Using this
technique, individuals can be identified because each individual will have a unique set
of DNA sequences. Once a unique ID database is established for an individual,
positive identification of that individual, living or dead, can be made from extremely
small tissue samples.

Forensic biology also benefits from using DNA-based identification techniques
as disclosed herein. DNA sequences taken from very small biological samples such as
tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be
amplified using PCR. In one prior art technique, gene sequences amplified from
polymorphic loci, such as DQa class II HLA gene, are used in forensic biology to
identify individuals. (Erlich, H., PCR Technology, Freeman and Co. (1992).) Once
these specific polymorphic loci are amplified, they are digested with one or more
restriction enzymes, yielding an identifying set of bands on a Southern blot probed with
DNA corresponding to the DQa class II HLA gene. Similarly, polynucleotides of the
present invention can be used as polymorphic markers for forensic purposes.

There is also a need for reagents capable of identifying the source of a particular
tissue. Such need arises, for example, in forensics when presented with tissue of
unknown origin. Appropriate reagents can comprise, for example, DNA probes or
primers specific to particular tissue prepared from the sequences of the present
invention. Panels of such reagents can identify tissue by species and/or by organ type.
In a similar fashion, these reagents can be used to screen tissue cultures for
contamination.

At the very least, the polynucleotides of the present invention can be used as
molecular weight markers on Southern gels, as diagnostic probes for the presence of a
specific mRNA in a particular cell type, as a probe to "subtract-out" known sequences
in the process of discovering novel polynucleotides, for selecting and making oligomers
for attachment to a "gene chip" or other support, to raise anti-DNA antibodies using
DNA immunization techniques, and as an antigen to elicit an immune response.

Uses of the Polypeptides 

Each of the polypeptides identified herein can be used in numerous ways. The
following description should be considered exemplary and utilizes known techniques.

A polypeptide of the present invention can be used to assay protein levels in a
biological sample using antibody-based techniques. For example, protein expression in
tissues can be studied with classical immunohistological methods. (Jalkanen, M., et
al., J. Cell. Biol. 101:976-985 (1985); Jalkanen, M., et al., J. Cell. Biol.
105:3087-3096 (1987).) Other antibody-based methods useful for detecting protein gene
expression include immunoassays, such as the enzyme linked immunosorbent assay
(ELISA) and the radioimmunoassay (RIA). Suitable antibody assay labels are known
in the art and include enzyme labels, such as glucose oxidase, and radioisotopes, such
as iodine (125I, 121I), carbon (14C), sulfur (35S), tritium (3H), indium (112In), and
technetium (99mTc), and fluorescent labels, such as fluorescein and rhodamine, and
biotin.

In addition to assaying secreted protein levels in a biological sample, proteins
can also be detected in vivo by imaging. Antibody labels or markers for in vivo
imaging of protein include those detectable by X-radiography, NMR or ESR. For
X-radiography, suitable labels include radioisotopes such as barium or cesium, which emit
detectable radiation but are not overtly harmful to the subject. Suitable markers for
NMR and ESR include those with a detectable characteristic spin, such as deuterium,
which may be incorporated into the antibody by labeling of nutrients for the relevant
hybridoma.

A protein-specific antibody or antibody fragment which has been labeled with
an appropriate detectable imaging moiety, such as a radioisotope (for example, 131I,
112In, 99mTc), a radio-opaque substance, or a material detectable by nuclear magnetic
resonance, is introduced (for example, parenterally, subcutaneously, or
intraperitoneally) into the mammal. It will be understood in the art that the size of the
subject and the imaging system used will determine the quantity of imaging moiety
needed to produce diagnostic images. In the case of a radioisotope moiety, for a human
subject, the quantity of radioactivity injected will normally range from about 5 to 20
millicuries of 99mTc. The labeled antibody or antibody fragment will then
preferentially accumulate at the location of cells which contain the specific protein. In
vivo tumor imaging is described in S.W. Burchiel et al., "Immunopharmacokinetics of
Radiolabeled Antibodies and Their Fragments." (Chapter 13 in Tumor Imaging: The
Radiochemical Detection of Cancer, S.W. Burchiel and B. A. Rhodes, eds., Masson
Publishing Inc. (1982).)

Thus, the invention provides a method of diagnosing a disorder, which involves
(a) assaying the expression of a polypeptide of the present invention in cells or body
fluid of an individual; and (b) comparing the level of gene expression with a standard
gene expression level, whereby an increase or decrease in the assayed polypeptide gene
expression level compared to the standard expression level is indicative of a disorder.
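
Step (b) of this method amounts to a comparison against a reference value, which can be sketched in Python as follows. The function name, the units, and in particular the 20% tolerance band are assumptions introduced for this sketch. The text does not specify how large a deviation from the standard is meaningful, and any real threshold would have to be set empirically.

```python
def classify_expression(assayed_level, standard_level, tolerance=0.2):
    """Compare an assayed expression level with a standard level.

    `tolerance` is a hypothetical fractional band around the standard
    inside which the assayed level is treated as unchanged.
    """
    if standard_level <= 0:
        raise ValueError("standard_level must be positive")
    ratio = assayed_level / standard_level
    if ratio > 1 + tolerance:
        return "increased"
    if ratio < 1 - tolerance:
        return "decreased"
    return "within standard range"


# Arbitrary illustrative readings against a standard of 100 units.
high_reading = classify_expression(150.0, 100.0)
low_reading = classify_expression(50.0, 100.0)
near_reading = classify_expression(105.0, 100.0)
```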

Moreover, polypeptides of the present invention can be used to treat disease.
For example, patients can be administered a polypeptide of the present invention in an
effort to replace absent or decreased levels of the polypeptide (e.g., insulin), to
supplement absent or decreased levels of a different polypeptide (e.g., hemoglobin S
for hemoglobin B), to inhibit the activity of a polypeptide (e.g., an oncogene), to
activate the activity of a polypeptide (e.g., by binding to a receptor), to reduce the
activity of a membrane bound receptor by competing with it for free ligand (e.g.,
soluble TNF receptors used in reducing inflammation), or to bring about a desired
response (e.g., blood vessel growth).

Similarly, antibodies directed to a polypeptide of the present invention can also
be used to treat disease. For example, administration of an antibody directed to a
polypeptide of the present invention can bind and reduce overproduction of the
polypeptide. Similarly, administration of an antibody can activate the polypeptide, such
as by binding to a polypeptide bound to a membrane (receptor).

At the very least, the polypeptides of the present invention can be used as
molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration
columns using methods well known to those of skill in the art. Polypeptides can also
be used to raise antibodies, which in turn are used to measure protein expression from a
recombinant cell, as a way of assessing transformation of the host cell. Moreover, the
polypeptides of the present invention can be used to test the following biological
activities.

Biological Activities

The polynucleotides and polypeptides of the present invention can be used in
assays to test for one or more biological activities. If these polynucleotides and
polypeptides do exhibit activity in a particular assay, it is likely that these molecules
may be involved in the diseases associated with the biological activity. Thus, the
polynucleotides and polypeptides could be used to treat the associated disease.

Immune Activity 

A polypeptide or polynucleotide of the present invention may be useful in
treating deficiencies or disorders of the immune system, by activating or inhibiting the
proliferation, differentiation, or mobilization (chemotaxis) of immune cells. Immune
cells develop through a process called hematopoiesis, producing myeloid (platelets, red
blood cells, neutrophils, and macrophages) and lymphoid (B and T lymphocytes) cells
from pluripotent stem cells. The etiology of these immune deficiencies or disorders
may be genetic, somatic, such as cancer or some autoimmune disorders, acquired (e.g.,
by chemotherapy or toxins), or infectious. Moreover, a polynucleotide or polypeptide
of the present invention can be used as a marker or detector of a particular immune
system disease or disorder.

A polynucleotide or polypeptide of the present invention may be useful in
treating or detecting deficiencies or disorders of hematopoietic cells. A polypeptide or
polynucleotide of the present invention could be used to increase differentiation and
proliferation of hematopoietic cells, including the pluripotent stem cells, in an effort to
treat those disorders associated with a decrease in certain (or many) types of hematopoietic
cells. Examples of immunologic deficiency syndromes include, but are not limited to:
blood protein disorders (e.g., agammaglobulinemia, dysgammaglobulinemia), ataxia
telangiectasia, common variable immunodeficiency, DiGeorge Syndrome, HIV
infection, HTLV-BLV infection, leukocyte adhesion deficiency syndrome,
lymphopenia, phagocyte bactericidal dysfunction, severe combined immunodeficiency
(SCIDs), Wiskott-Aldrich Disorder, anemia, thrombocytopenia, or hemoglobinuria.

Moreover, a polypeptide or polynucleotide of the present invention could also
be used to modulate hemostatic (the stopping of bleeding) or thrombolytic activity (clot
formation). For example, by increasing hemostatic or thrombolytic activity, a
polynucleotide or polypeptide of the present invention could be used to treat blood
coagulation disorders (e.g., afibrinogenemia, factor deficiencies), blood platelet
disorders (e.g., thrombocytopenia), or wounds resulting from trauma, surgery, or other
causes. Alternatively, a polynucleotide or polypeptide of the present invention that can
decrease hemostatic or thrombolytic activity could be used to inhibit or dissolve
clotting. These molecules could be important in the treatment of heart attacks
(infarction), strokes, or scarring.

A polynucleotide or polypeptide of the present invention may also be useful in
treating or detecting autoimmune disorders. Many autoimmune disorders result from
inappropriate recognition of self as foreign material by immune cells. This
inappropriate recognition results in an immune response leading to the destruction of the
host tissue. Therefore, the administration of a polypeptide or polynucleotide of the
present invention that inhibits an immune response, particularly the proliferation,
differentiation, or chemotaxis of T-cells, may be an effective therapy in preventing
autoimmune disorders.

Examples of autoimmune disorders that can be treated or detected by the present
invention include, but are not limited to: Addison's Disease, hemolytic anemia,
antiphospholipid syndrome, rheumatoid arthritis, dermatitis, allergic encephalomyelitis,
glomerulonephritis, Goodpasture's Syndrome, Graves' Disease, Multiple Sclerosis,
Myasthenia Gravis, Neuritis, Ophthalmia, Bullous Pemphigoid, Pemphigus,
Polyendocrinopathies, Purpura, Reiter's Disease, Stiff-Man Syndrome, Autoimmune
Thyroiditis, Systemic Lupus Erythematosus, Autoimmune Pulmonary Inflammation,
Guillain-Barre Syndrome, insulin dependent diabetes mellitus, and autoimmune
inflammatory eye disease.

Similarly, allergic reactions and conditions, such as asthma (particularly allergic 
asthma) or other respiratory problems, may also be treated by a polypeptide or
polynucleotide of the present invention. Moreover, these molecules can be used to treat 
anaphylaxis, hypersensitivity to an antigenic molecule, or blood group incompatibility. 

A polynucleotide or polypeptide of the present invention may also be used to
treat and/or prevent organ rejection or graft-versus-host disease (GVHD). Organ
rejection occurs by host immune cell destruction of the transplanted tissue through an
immune response. Similarly, an immune response is also involved in GVHD, but, in
this case, the foreign transplanted immune cells destroy the host tissues. The
administration of a polypeptide or polynucleotide of the present invention that inhibits
an immune response, particularly the proliferation, differentiation, or chemotaxis of
T-cells, may be an effective therapy in preventing organ rejection or GVHD.

Similarly, a polypeptide or polynucleotide of the present invention may also be
used to modulate inflammation. For example, the polypeptide or polynucleotide may
inhibit the proliferation and differentiation of cells involved in an inflammatory
response. These molecules can be used to treat inflammatory conditions, both chronic
and acute conditions, including inflammation associated with infection (e.g., septic
shock, sepsis, or systemic inflammatory response syndrome (SIRS)),
ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated hyperacute
rejection, nephritis, cytokine or chemokine induced lung injury, inflammatory bowel
disease, Crohn's disease, or resulting from over production of cytokines (e.g., TNF or
IL-1).

Hyperproliferative Disorders

A polypeptide or polynucleotide can be used to treat or detect hyperproliferative
disorders, including neoplasms. A polypeptide or polynucleotide of the present
invention may inhibit the proliferation of the disorder through direct or indirect
interactions. Alternatively, a polypeptide or polynucleotide of the present invention
may proliferate other cells which can inhibit the hyperproliferative disorder.

For example, by increasing an immune response, particularly increasing
antigenic qualities of the hyperproliferative disorder or by proliferating, differentiating,
or mobilizing T-cells, hyperproliferative disorders can be treated. This immune
response may be increased by either enhancing an existing immune response, or by
initiating a new immune response. Alternatively, decreasing an immune response may
also be a method of treating hyperproliferative disorders, such as a chemotherapeutic
agent.

Examples of hyperproliferative disorders that can be treated or detected by a
polynucleotide or polypeptide of the present invention include, but are not limited to,
neoplasms located in the: abdomen, bone, breast, digestive system, liver, pancreas,
peritoneum, endocrine glands (adrenal, parathyroid, pituitary, testicles, ovary, thymus,
thyroid), eye, head and neck, nervous (central and peripheral), lymphatic system,
pelvic, skin, soft tissue, spleen, thoracic, and urogenital.

Similarly, other hyperproliferative disorders can also be treated or detected by a
polynucleotide or polypeptide of the present invention. Examples of such
hyperproliferative disorders include, but are not limited to: hypergammaglobulinemia,
lymphoproliferative disorders, paraproteinemias, purpura, sarcoidosis, Sezary
Syndrome, Waldenstrom's Macroglobulinemia, Gaucher's Disease, histiocytosis, and
any other hyperproliferative disease, besides neoplasia, located in an organ system
listed above.

Infectious Disease 

A polypeptide or polynucleotide of the present invention can be used to treat or
detect infectious agents. For example, by increasing the immune response, particularly
increasing the proliferation and differentiation of B and/or T cells, infectious diseases
may be treated. The immune response may be increased by either enhancing an existing
immune response, or by initiating a new immune response. Alternatively, the
polypeptide or polynucleotide of the present invention may also directly inhibit the
infectious agent, without necessarily eliciting an immune response.

Viruses are one example of an infectious agent that can cause disease or
symptoms that can be treated or detected by a polynucleotide or polypeptide of the
present invention. Examples of viruses include, but are not limited to, the following
DNA and RNA viral families: Arbovirus, Adenoviridae, Arenaviridae, Arterivirus,
Birnaviridae, Bunyaviridae, Caliciviridae, Circoviridae, Coronaviridae, Flaviviridae,
Hepadnaviridae (Hepatitis), Herpesviridae (such as Cytomegalovirus, Herpes
Simplex, Herpes Zoster), Mononegavirus (e.g., Paramyxoviridae, Morbillivirus,
Rhabdoviridae), Orthomyxoviridae (e.g., Influenza), Papovaviridae, Parvoviridae,
Picornaviridae, Poxviridae (such as Smallpox or Vaccinia), Reoviridae (e.g.,
Rotavirus), Retroviridae (HTLV-I, HTLV-II, Lentivirus), and Togaviridae (e.g.,
Rubivirus). Viruses falling within these families can cause a variety of diseases or
symptoms, including, but not limited to: arthritis, bronchiolitis, encephalitis, eye
infections (e.g., conjunctivitis, keratitis), chronic fatigue syndrome, hepatitis (A, B, C,
E, Chronic Active, Delta), meningitis, opportunistic infections (e.g., AIDS),
pneumonia, Burkitt's Lymphoma, chickenpox, hemorrhagic fever, Measles, Mumps,
Parainfluenza, Rabies, the common cold, Polio, leukemia, Rubella, sexually
transmitted diseases, skin diseases (e.g., Kaposi's, warts), and viremia. A polypeptide
or polynucleotide of the present invention can be used to treat or detect any of these
symptoms or diseases.

Similarly, bacterial or fungal agents that can cause disease or symptoms and that
can be treated or detected by a polynucleotide or polypeptide of the present invention
include, but are not limited to, the following Gram-negative and Gram-positive bacterial
families and fungi: Actinomycetales (e.g., Corynebacterium, Mycobacterium,
Nocardia), Aspergillosis, Bacillaceae (e.g., Anthrax, Clostridium), Bacteroidaceae,
Blastomycosis, Bordetella, Borrelia, Brucellosis, Candidiasis, Campylobacter,
Coccidioidomycosis, Cryptococcosis, Dermatomycoses, Enterobacteriaceae (Klebsiella,
Salmonella, Serratia, Yersinia), Erysipelothrix, Helicobacter, Legionellosis,
Leptospirosis, Listeria, Mycoplasmatales, Neisseriaceae (e.g., Acinetobacter,
Gonorrhea, Meningococcal), Pasteurellaceae Infections (e.g., Actinobacillus,
Haemophilus, Pasteurella), Pseudomonas, Rickettsiaceae, Chlamydiaceae, Syphilis,
and Staphylococcal. These bacterial or fungal families can cause the following diseases
or symptoms, including, but not limited to: bacteremia, endocarditis, eye infections
(conjunctivitis, tuberculosis, uveitis), gingivitis, opportunistic infections (e.g., AIDS
related infections), paronychia, prosthesis-related infections, Reiter's Disease,
respiratory tract infections, such as Whooping Cough or Empyema, sepsis, Lyme
Disease, Cat-Scratch Disease, Dysentery, Paratyphoid Fever, food poisoning,
Typhoid, pneumonia, Gonorrhea, meningitis, Chlamydia, Syphilis, Diphtheria,
Leprosy, Paratuberculosis, Tuberculosis, Lupus, Botulism, gangrene, tetanus,
impetigo, Rheumatic Fever, Scarlet Fever, sexually transmitted diseases, skin diseases
(e.g., cellulitis, dermatomycoses), toxemia, urinary tract infections, wound infections.
A polypeptide or polynucleotide of the present invention can be used to treat or detect
any of these symptoms or diseases.

Moreover, parasitic agents causing disease or symptoms that can be treated or
detected by a polynucleotide or polypeptide of the present invention include, but are not
limited to, the following families: Amebiasis, Babesiosis, Coccidiosis,
Cryptosporidiosis, Dientamoebiasis, Dourine, Ectoparasitic, Giardiasis, Helminthiasis,
Leishmaniasis, Theileriasis, Toxoplasmosis, Trypanosomiasis, and Trichomonas.
These parasites can cause a variety of diseases or symptoms, including, but not limited
to: Scabies, Trombiculiasis, eye infections, intestinal disease (e.g., dysentery,
giardiasis), liver disease, lung disease, opportunistic infections (e.g., AIDS related),
Malaria, pregnancy complications, and toxoplasmosis. A polypeptide or polynucleotide
of the present invention can be used to treat or detect any of these symptoms or
diseases.

Preferably, treatment using a polypeptide or polynucleotide of the present 
invention could either be by administering an effective amount of a polypeptide to the 
patient, or by removing cells from the patient, supplying the cells with a polynucleotide 
of the present invention, and returning the engineered cells to the patient (ex vivo
therapy). Moreover, the polypeptide or polynucleotide of the present invention can be
used as an antigen in a vaccine to raise an immune response against infectious disease. 

Regeneration 

A polynucleotide or polypeptide of the present invention can be used to
differentiate, proliferate, and attract cells, leading to the regeneration of tissues. (See,
Science 276:59-87 (1997).) The regeneration of tissues could be used to repair,
replace, or protect tissue damaged by congenital defects, trauma (wounds, burns,
incisions, or ulcers), age, disease (e.g., osteoporosis, osteoarthritis, periodontal
disease, liver failure), surgery, including cosmetic plastic surgery, fibrosis, reperfusion
injury, or systemic cytokine damage.

Tissues that could be regenerated using the present invention include organs
(e.g., pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal
or cardiac), vascular (including vascular endothelium), nervous, hematopoietic, and
skeletal (bone, cartilage, tendon, and ligament) tissue. Preferably, regeneration occurs
without scarring or with decreased scarring. Regeneration also may include angiogenesis.

Moreover, a polynucleotide or polypeptide of the present invention may increase
regeneration of tissues difficult to heal. For example, increased tendon/ligament
regeneration would quicken recovery time after damage. A polynucleotide or
polypeptide of the present invention could also be used prophylactically in an effort to
avoid damage. Specific diseases that could be treated include tendinitis, carpal tunnel
syndrome, and other tendon or ligament defects. Further examples of tissue
regeneration of non-healing wounds include pressure ulcers, ulcers associated with
vascular insufficiency, and surgical and traumatic wounds.

Similarly, nerve and brain tissue could also be regenerated by using a
polynucleotide or polypeptide of the present invention to proliferate and differentiate
nerve cells. Diseases that could be treated using this method include central and
peripheral nervous system diseases, neuropathies, or mechanical and traumatic
disorders (e.g., spinal cord disorders, head trauma, cerebrovascular disease, and
stroke). Specifically, diseases associated with peripheral nerve injuries, peripheral
neuropathy (e.g., resulting from chemotherapy or other medical therapies), localized
neuropathies, and central nervous system diseases (e.g., Alzheimer's disease,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager syndrome), could all be treated using the polynucleotide or polypeptide of the
present invention.

Chemotaxis 

A polynucleotide or polypeptide of the present invention may have chemotaxis
activity. A chemotaxic molecule attracts or mobilizes cells (e.g., monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells) to a particular site in the body, such as inflammation, infection, or site of
hyperproliferation. The mobilized cells can then fight off and/or heal the particular
trauma or abnormality.

A polynucleotide or polypeptide of the present invention may increase
chemotaxic activity of particular cells. These chemotactic molecules can then be used to
treat inflammation, infection, hyperproliferative disorders, or any immune system
disorder by increasing the number of cells targeted to a particular location in the body.
For example, chemotaxic molecules can be used to treat wounds and other trauma to
tissues by attracting immune cells to the injured location. Chemotactic molecules of the
present invention can also attract fibroblasts, which can be used to treat wounds.

It is also contemplated that a polynucleotide or polypeptide of the present
invention may inhibit chemotactic activity. These molecules could also be used to treat
disorders. Thus, a polynucleotide or polypeptide of the present invention could be used
as an inhibitor of chemotaxis.

Binding Activity 

A polypeptide of the present invention may be used to screen for molecules that
bind to the polypeptide or for molecules to which the polypeptide binds. The binding
of the polypeptide and the molecule may activate (agonist), increase, inhibit
(antagonist), or decrease activity of the polypeptide or the molecule bound. Examples
of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or
small molecules.

Preferably, the molecule is closely related to the natural ligand of the
polypeptide, e.g., a fragment of the ligand, or a natural substrate, a ligand, a structural
or functional mimetic. (See, Coligan et al., Current Protocols in Immunology
1(2):Chapter 5 (1991).) Similarly, the molecule can be closely related to the natural
receptor to which the polypeptide binds, or at least, a fragment of the receptor capable
of being bound by the polypeptide (e.g., active site). In either case, the molecule can
be rationally designed using known techniques.

Preferably, the screening for these molecules involves producing appropriate
cells which express the polypeptide, either as a secreted protein or on the cell
membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli.
Cells expressing the polypeptide (or cell membrane containing the expressed
polypeptide) are then preferably contacted with a test compound potentially containing
the molecule to observe binding, stimulation, or inhibition of activity of either the
polypeptide or the molecule.

The assay may simply test binding of a candidate compound to the polypeptide, 
wherein binding is detected by a label, or in an assay involving competition with a 
labeled competitor. Further, the assay may test whether the candidate compound results
in a signal generated by binding to the polypeptide.

Alternatively, the assay can be carried out using cell-free preparations, 
polypeptide/molecule affixed to a solid support, chemical libraries, or natural product 
mixtures. The assay may also simply comprise the steps of mixing a candidate 
compound with a solution containing a polypeptide, measuring polypeptide/molecule
activity or binding, and comparing the polypeptide/molecule activity or binding to a
standard.

Preferably, an ELISA assay can measure polypeptide level or activity in a
sample (e.g., biological sample) using a monoclonal or polyclonal antibody. The 
antibody can measure polypeptide level or activity by either binding, directly or 
indirectly, to the polypeptide or by competing with the polypeptide for a substrate.

All of these above assays can be used as diagnostic or prognostic markers. The
molecules discovered using these assays can be used to treat disease or to bring about a
particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the
polypeptide/molecule. Moreover, the assays can discover agents which may inhibit or
enhance the production of the polypeptide from suitably manipulated cells or tissues.

Therefore, the invention includes a method of identifying compounds which
bind to a polypeptide of the invention comprising the steps of: (a) incubating a
candidate binding compound with a polypeptide of the invention; and (b) determining if
binding has occurred. Moreover, the invention includes a method of identifying
agonists/antagonists comprising the steps of: (a) incubating a candidate compound with
a polypeptide of the invention, (b) assaying a biological activity, and (c) determining if
a biological activity of the polypeptide has been altered.

Other Activities 

A polypeptide or polynucleotide of the present invention may also increase or 
decrease the differentiation or proliferation of embryonic stem cells, besides, as
discussed above, hematopoietic lineage.

A polypeptide or polynucleotide of the present invention may also be used to
modulate mammalian characteristics, such as body height, weight, hair color, eye color,
skin, percentage of adipose tissue, pigmentation, size, and shape (e.g., cosmetic
surgery). Similarly, a polypeptide or polynucleotide of the present invention may be
used to modulate mammalian metabolism affecting catabolism, anabolism, processing,
utilization, and storage of energy.

A polypeptide or polynucleotide of the present invention may be used to change
a mammal's mental state or physical state by influencing biorhythms, circadian
rhythms, depression (including depressive disorders), tendency for violence, tolerance
for pain, reproductive capabilities (preferably by Activin or Inhibin-like activity),
hormonal or endocrine levels, appetite, libido, memory, stress, or other cognitive
qualities.

A polypeptide or polynucleotide of the present invention may also be used as a 
food additive or preservative, such as to increase or decrease storage capabilities, fat
content, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional
components.

Other Preferred Embodiments

Other preferred embodiments of the claimed invention include an isolated 
nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical 
to a sequence of at least about 50 contiguous nucleotides in the nucleotide sequence of
SEQ ID NO:X wherein X is any integer as defined in Table 1.

Also preferred is a nucleic acid molecule wherein said sequence of contiguous 
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of 
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the
Clone Sequence and ending with the nucleotide at about the position of the 3'
Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Also preferred is a nucleic acid molecule wherein said sequence of contiguous 
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of 
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the 
Start Codon and ending with the nucleotide at about the position of the 3' Nucleotide of
the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Similarly preferred is a nucleic acid molecule wherein said sequence of 
contiguous nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the 
range of positions beginning with the nucleotide at about the position of the 5' 
Nucleotide of the First Amino Acid of the Signal Peptide and ending with the nucleotide
at about the position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID 
NO:X in Table 1. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least about 150 contiguous 
nucleotides in the nucleotide sequence of SEQ ID NO:X.

Further preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least about 500 contiguous 
nucleotides in the nucleotide sequence of SEQ ID NO:X. 

A further preferred embodiment is a nucleic acid molecule comprising a 
nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ
ID NO:X beginning with the nucleotide at about the position of the 5' Nucleotide of the
First Amino Acid of the Signal Peptide and ending with the nucleotide at about the
position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in
Table 1.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to the complete nucleotide 
sequence of SEQ ID NO:X. 
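
The "at least 95% identical" threshold used throughout these embodiments can be illustrated with a short calculation. The following is only a minimal sketch, assuming an ungapped, position-by-position comparison of two sequences of equal length; it is not necessarily the alignment method this specification relies on, and the function names percent_identity and meets_threshold are hypothetical.

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent of matching positions between two equal-length, ungapped sequences.

    Assumption: no gaps or alignment step; position i of the query is compared
    directly with position i of the subject.
    """
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length for an ungapped comparison")
    if not query:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(query.upper(), subject.upper()))
    return 100.0 * matches / len(query)


def meets_threshold(query: str, subject: str, threshold: float = 95.0) -> bool:
    """True if the ungapped percent identity is at least the given threshold."""
    return percent_identity(query, subject) >= threshold


if __name__ == "__main__":
    reference = "ACGT" * 5             # 20-nucleotide toy reference
    variant = "ACGT" * 4 + "ACGA"      # one mismatch in 20 positions
    print(percent_identity(reference, variant))
    print(meets_threshold(reference, variant))
```

Under this simplified reading, a 50-nucleotide segment may differ from the reference at no more than 2 positions and still be at least 95% identical (48/50 = 96%).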

Also preferred is an isolated nucleic acid molecule which hybridizes under 
stringent hybridization conditions to a nucleic acid molecule, wherein said nucleic acid
molecule which hybridizes does not hybridize under stringent hybridization conditions 
to a nucleic acid molecule having a nucleotide sequence consisting of only A residues or 
of only T residues. 

Also preferred is a composition of matter comprising a DNA molecule which 
comprises a human cDNA clone identified by a cDNA Clone Identifier in Table 1,
which DNA molecule is contained in the material deposited with the American Type 
Culture Collection and given the ATCC Deposit Number shown in Table 1 for said 
cDNA Clone Identifier. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in the nucleotide sequence of a human cDNA clone identified by a cDNA 
Clone Identifier in Table 1, which DNA molecule is contained in the deposit given the 
ATCC Deposit Number shown in Table 1. 

Also preferred is an isolated nucleic acid molecule, wherein said sequence of at 
least 50 contiguous nucleotides is included in the nucleotide sequence of the complete
open reading frame sequence encoded by said human cDNA clone. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least 150 contiguous
nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to a sequence of at least 500
contiguous nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising 
a nucleotide sequence which is at least 95% identical to the complete nucleotide 
sequence encoded by said human cDNA clone.

A further preferred embodiment is a method for detecting in a biological sample 
a nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical 
to a sequence of at least 50 contiguous nucleotides in a sequence selected from the
group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any integer
as defined in Table 1; and a nucleotide sequence encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
comprises a step of comparing a nucleotide sequence of at least one nucleic acid
molecule in said sample with a sequence selected from said group and determining
whether the sequence of said nucleic acid molecule in said sample is at least 95%
identical to said selected sequence.

Also preferred is the above method wherein said step of comparing sequences
comprises determining the extent of nucleic acid hybridization between nucleic acid 
molecules in said sample and a nucleic acid molecule comprising said sequence selected 
from said group. Similarly, also preferred is the above method wherein said step of 
comparing sequences is performed by comparing the nucleotide sequence determined
from a nucleic acid molecule in said sample with said sequence selected from said
group. The nucleic acid molecules can comprise DNA molecules or RNA molecules. 
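
Where the comparing step is performed on determined sequences rather than by hybridization, it can be pictured as a window search. The sketch below is illustrative only: it assumes an ungapped comparison, a fixed window of 50 nucleotides, and a brute-force scan, and the names best_window_identity and detects are hypothetical rather than taken from this specification.

```python
def best_window_identity(sample: str, reference: str, window: int = 50) -> float:
    """Highest ungapped percent identity between any window-length segment of
    the reference and any window-length segment of the sample.

    Brute force (every reference window against every sample window), which is
    adequate for a short illustration but slow for long sequences.
    """
    sample, reference = sample.upper(), reference.upper()
    if len(sample) < window or len(reference) < window:
        return 0.0  # no segment of the required length exists
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_segment = reference[i:i + window]
        for j in range(len(sample) - window + 1):
            matches = sum(a == b for a, b in zip(ref_segment, sample[j:j + window]))
            best = max(best, 100.0 * matches / window)
    return best


def detects(sample: str, reference: str, window: int = 50, threshold: float = 95.0) -> bool:
    """True if some sample segment is at least `threshold` percent identical to
    some reference segment of `window` contiguous nucleotides."""
    return best_window_identity(sample, reference, window) >= threshold


if __name__ == "__main__":
    reference = "ACGT" * 20                       # 80-nucleotide toy reference
    sample = "TTTT" + reference[10:70] + "GGG"    # carries an exact 60-nucleotide stretch
    print(detects(sample, reference))
```

A practical implementation would more likely use an established alignment tool; the scan above only makes the two thresholds (segment length and percent identity) concrete.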

A further preferred embodiment is a method for identifying the species, tissue or 
cell type of a biological sample which method comprises a step of detecting nucleic acid 
molecules in said sample, if any, comprising a nucleotide sequence that is at least 95%
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any
integer as defined in Table 1; and a nucleotide sequence encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1. 

The method for identifying the species, tissue or cell type of a biological sample
can comprise a step of detecting nucleic acid molecules comprising a nucleotide 
sequence in a panel of at least two nucleotide sequences, wherein at least one sequence 
in said panel is at least 95% identical to a sequence of at least 50 contiguous nucleotides 
in a sequence selected from said group. 

Also preferred is a method for diagnosing in a subject a pathological condition
associated with abnormal structure or expression of a gene encoding a secreted protein
identified in Table 1, which method comprises a step of detecting in a biological sample
obtained from said subject nucleic acid molecules, if any, comprising a nucleotide
sequence that is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in a sequence selected from the group consisting of: a nucleotide sequence
of SEQ ID NO:X wherein X is any integer as defined in Table 1; and a nucleotide
sequence encoded by a human cDNA clone identified by a cDNA Clone Identifier in
Table 1 and contained in the deposit with the ATCC Deposit Number shown for said 
cDNA clone in Table 1. 

The method for diagnosing a pathological condition can comprise a step of
detecting nucleic acid molecules comprising a nucleotide sequence in a panel of at least
two nucleotide sequences, wherein at least one sequence in said panel is at least 95%
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
said group. 

Also preferred is a composition of matter comprising isolated nucleic acid
molecules wherein the nucleotide sequences of said nucleic acid molecules comprise a
panel of at least two nucleotide sequences, wherein at least one sequence in said panel is
at least 95% identical to a sequence of at least 50 contiguous nucleotides in a sequence
selected from the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein
X is any integer as defined in Table 1; and a nucleotide sequence encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
nucleic acid molecules can comprise DNA molecules or RNA molecules.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 90% identical to a sequence of at least about 10 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1.

Also preferred is a polypeptide, wherein said sequence of contiguous amino
acids is included in the amino acid sequence of SEQ ID NO:Y in the range of positions
beginning with the residue at about the position of the First Amino Acid of the Secreted
Portion and ending with the residue at about the Last Amino Acid of the Open Reading
Frame as set forth for SEQ ID NO:Y in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to a sequence of at least about 100 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to the complete amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 90% identical to a sequence of at least about 10 contiguous amino acids in the
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is a polypeptide wherein said sequence of contiguous amino
acids is included in the amino acid sequence of a secreted portion of the secreted protein
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in
Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of the secreted portion of the protein encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to a sequence of at least about 100 contiguous amino acids in the 
amino acid sequence of the secreted portion of the protein encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to the amino acid sequence of the secreted portion of the protein 
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and 
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in 
Table 1.

Further preferred is an isolated antibody which binds specifically to a
polypeptide comprising an amino acid sequence that is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a protein encoded by a
human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in
the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method for detecting in a biological sample a polypeptide 
comprising an amino acid sequence which is at least 90% identical to a sequence of at
least 10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
comprises a step of comparing an amino acid sequence of at least one polypeptide
molecule in said sample with a sequence selected from said group and determining
whether the sequence of said polypeptide molecule in said sample is at least 90%
identical to said sequence of at least 10 contiguous amino acids.

Also preferred is the above method wherein said step of comparing an amino
acid sequence of at least one polypeptide molecule in said sample with a sequence
selected from said group comprises determining the extent of specific binding of
polypeptides in said sample to an antibody which binds specifically to a polypeptide
comprising an amino acid sequence that is at least 90% identical to a sequence of at least
10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method wherein said step of comparing sequences is 
performed by comparing the amino acid sequence determined from a polypeptide 
molecule in said sample with said sequence selected from said group. 

Also preferred is a method for identifying the species, tissue or cell type of a
biological sample which method comprises a step of detecting polypeptide molecules in
said sample, if any, comprising an amino acid sequence that is at least 90% identical to
a sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a secreted protein encoded
by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method for identifying the species, tissue or cell type
of a biological sample, which method comprises a step of detecting polypeptide
molecules comprising an amino acid sequence in a panel of at least two amino acid
sequences, wherein at least one sequence in said panel is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the above
group.

Also preferred is a method for diagnosing in a subject a pathological condition 
associated with abnormal structure or expression of a gene encoding a secreted protein
identified in Table 1, which method comprises a step of detecting in a biological sample
obtained from said subject polypeptide molecules comprising an amino acid sequence in
a panel of at least two amino acid sequences, wherein at least one sequence in said panel
is at least 90% identical to a sequence of at least 10 contiguous amino acids in a
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO:Y wherein Y is any integer as defined in Table 1; and a complete amino acid
sequence of a secreted protein encoded by a human cDNA clone identified by a cDNA
Clone Identifier in Table 1 and contained in the deposit with the ATCC Deposit Number
shown for said cDNA clone in Table 1.

In any of these methods, the step of detecting said polypeptide molecules
includes using an antibody.



Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a nucleotide sequence encoding a 
polypeptide wherein said polypeptide comprises an amino acid sequence that is at least 
90% identical to a sequence of at least 10 contiguous amino acids in a sequence selected 
from the group consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is
any integer as defined in Table 1; and a complete amino acid sequence of a secreted
protein encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table
1 and contained in the deposit with the ATCC Deposit Number shown for said cDNA
clone in Table 1.

Also preferred is an isolated nucleic acid molecule, wherein said nucleotide
sequence encoding a polypeptide has been optimized for expression of said polypeptide 
in a prokaryotic host. 

Also preferred is an isolated nucleic acid molecule, wherein said polypeptide 
comprises an amino acid sequence selected from the group consisting of: an amino acid
sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1; and a
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method of making a recombinant vector comprising
inserting any of the above isolated nucleic acid molecules into a vector. Also preferred is
the recombinant vector produced by this method. Also preferred is a method of making
a recombinant host cell comprising introducing the vector into a host cell, as well as the
recombinant host cell produced by this method. 

Also preferred is a method of making an isolated polypeptide comprising
culturing this recombinant host cell under conditions such that said polypeptide is
expressed and recovering said polypeptide. Also preferred is this method of making an
isolated polypeptide, wherein said recombinant host cell is a eukaryotic cell and said
polypeptide is a secreted portion of a human secreted protein comprising an amino acid
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO:Y beginning with the residue at the position of the First Amino Acid of the Secreted
Portion of SEQ ID NO:Y wherein Y is an integer set forth in Table 1 and said position
of the First Amino Acid of the Secreted Portion of SEQ ID NO:Y is defined in Table 1;
and an amino acid sequence of a secreted portion of a protein encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
isolated polypeptide produced by this method is also preferred.

Also preferred is a method of treatment of an individual in need of an increased
level of a secreted protein activity, which method comprises administering to such an
individual a pharmaceutical composition comprising an amount of an isolated
polypeptide, polynucleotide, or antibody of the claimed invention effective to increase
the level of said protein activity in said individual.

Having generally described the invention, the same will be more readily 
understood by reference to the following examples, which are provided by way of
illustration and are not intended as limiting. 

Examples

Example 1: Isolation of a Selected cDNA Clone From the Deposited 
Sample 

Each cDNA clone in a cited ATCC deposit is contained in a plasmid vector. 
Table 1 identifies the vectors used to construct the cDNA library from which each clone
was isolated. In many cases, the vector used to construct the library is a phage vector 
from which a plasmid has been excised. The table immediately below correlates the 
related plasmid for each phage vector used in constructing the cDNA library. For 
example, where a particular clone is identified in Table 1 as being isolated in the vector 
"Lambda Zap," the corresponding deposited clone is in "pBluescript."

Vector Used to Construct Library    Corresponding Deposited Plasmid
Lambda Zap                          pBluescript (pBS)
Uni-Zap XR                          pBluescript (pBS)
Zap Express                         pBK
lafmid BA                           plafmid BA
pSport1                             pSport1
pCMVSport 2.0                       pCMVSport 2.0
pCMVSport 3.0                       pCMVSport 3.0
pCR®2.1                             pCR®2.1
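
The correspondence in the table above amounts to a simple lookup from library vector to deposited plasmid. As a minimal sketch for anyone tabulating clones electronically (the names PHAGE_TO_PLASMID and deposited_plasmid are hypothetical and not part of this specification), it could be recorded as follows:

```python
# Library vector (as listed in Table 1) -> plasmid in which the clone is deposited.
# Entries are copied from the table above; nothing else is assumed.
PHAGE_TO_PLASMID = {
    "Lambda Zap": "pBluescript (pBS)",
    "Uni-Zap XR": "pBluescript (pBS)",
    "Zap Express": "pBK",
    "lafmid BA": "plafmid BA",
    "pSport1": "pSport1",
    "pCMVSport 2.0": "pCMVSport 2.0",
    "pCMVSport 3.0": "pCMVSport 3.0",
    "pCR®2.1": "pCR®2.1",
}


def deposited_plasmid(library_vector: str) -> str:
    """Return the deposited plasmid for a library vector named in the table.

    Raises KeyError for a vector that the table does not list, rather than
    guessing at a correspondence.
    """
    return PHAGE_TO_PLASMID[library_vector]


if __name__ == "__main__":
    print(deposited_plasmid("Lambda Zap"))
```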
Vectors Lambda Zap (U.S. Patent Nos. 5,128,256 and 5,286,636), Uni-Zap
XR (U.S. Patent Nos. 5,128,256 and 5,286,636), Zap Express (U.S. Patent Nos.
5,128,256 and 5,286,636), pBluescript (pBS) (Short, J. M. et al., Nucleic Acids Res.
16:7583-7600 (1988); Alting-Mees, M. A. and Short, J. M., Nucleic Acids Res.
17:9494 (1989)) and pBK (Alting-Mees, M. A. et al., Strategies 5:58-61 (1992)) are
commercially available from Stratagene Cloning Systems, Inc., 11011 N. Torrey Pines
Road, La Jolla, CA, 92037. pBS contains an ampicillin resistance gene and pBK
contains a neomycin resistance gene. Both can be transformed into E. coli strain XL-1
Blue, also available from Stratagene. pBS comes in 4 forms: SK+, SK-, KS+ and KS-.
The S and K refer to the orientation of the polylinker to the T7 and T3 primer
sequences which flank the polylinker region ("S" is for SacI and "K" is for KpnI, which
are the first sites on each respective end of the linker). "+" or "-" refers to the orientation
of the f1 origin of replication ("ori"), such that in one orientation, single stranded rescue
initiated from the f1 ori generates sense strand DNA and in the other, antisense.

Vectors pSport1, pCMVSport 2.0 and pCMVSport 3.0 were obtained from
Life Technologies, Inc., P. O. Box 6009, Gaithersburg, MD 20897. All Sport vectors
contain an ampicillin resistance gene and may be transformed into E. coli strain
DH10B, also available from Life Technologies. (See, for instance, Gruber, C. E., et
al., Focus 15:59 (1993).) Vector lafmid BA (Bento Soares, Columbia University, NY)
contains an ampicillin resistance gene and can be transformed into E. coli strain XL-1
Blue. Vector pCR®2.1, which is available from Invitrogen, 1600 Faraday Avenue,
Carlsbad, CA 92008, contains an ampicillin resistance gene and may be transformed
into E. coli strain DH10B, available from Life Technologies. (See, for instance, Clark,
J. M., Nuc. Acids Res. 16:9677-9686 (1988) and Mead, D. et al., Bio/Technology 9:
(1991).) Preferably, a polynucleotide of the present invention does not comprise the
phage vector sequences identified for the particular clone in Table 1, as well as the
corresponding plasmid vector sequences designated above.

The deposited material in the sample assigned the ATCC Deposit Number cited
in Table 1 for any given cDNA clone also may contain one or more additional plasmids,
each comprising a cDNA clone different from that given clone. Thus, deposits sharing
the same ATCC Deposit Number contain at least a plasmid for each cDNA clone
identified in Table 1. Typically, each ATCC deposit sample cited in Table 1 comprises
a mixture of approximately equal amounts (by weight) of about 50 plasmid DNAs, each
containing a different cDNA clone; but such a deposit sample may include plasmids for
more or less than 50 cDNA clones, up to about 500 cDNA clones.

Two approaches can be used to isolate a particular clone from the deposited
sample of plasmid DNAs cited for that clone in Table 1. First, a plasmid is directly
isolated by screening the clones using a polynucleotide probe corresponding to SEQ ID
NO:X.

Particularly, a specific polynucleotide with 30-40 nucleotides is synthesized
using an Applied Biosystems DNA synthesizer according to the sequence reported.
The oligonucleotide is labeled, for instance, with ³²P-γ-ATP using T4 polynucleotide
kinase and purified according to routine methods. (E.g., Maniatis et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring, NY (1982).)
The plasmid mixture is transformed into a suitable host, as indicated above (such as
XL-1 Blue (Stratagene)) using techniques known to those of skill in the art, such as
those provided by the vector supplier or in related publications or patents cited above.
The transformants are plated on 1.5% agar plates (containing the appropriate selection
agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate.
These plates are screened using Nylon membranes according to routine methods for
bacterial colony screening (e.g., Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd Edit., (1989), Cold Spring Harbor Laboratory Press, pages 1.93 to
1.104), or other techniques known to those of skill in the art.
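
The plating density given above can be related to the number of clones in a deposit by a short calculation. The following is a rough sketch only. It assumes that every plasmid in the mixture is equally likely to give rise to any one colony, which equal amounts by weight and differing transformation efficiencies do not guarantee. The function names are hypothetical.

```python
def expected_positives(n_clones: int, colonies_per_plate: int, plates: int = 1) -> float:
    """Expected number of colonies carrying the clone of interest, assuming
    each colony is an independent, uniform draw from n_clones plasmids."""
    return colonies_per_plate * plates / n_clones


def prob_at_least_one(n_clones: int, colonies_per_plate: int, plates: int = 1) -> float:
    """Probability that at least one screened colony carries the clone of
    interest, under the same uniform-draw assumption."""
    p_miss_all = (1.0 - 1.0 / n_clones) ** (colonies_per_plate * plates)
    return 1.0 - p_miss_all


if __name__ == "__main__":
    # Typical deposit (about 50 clones) versus the upper bound (about 500 clones),
    # both at about 150 colonies per plate.
    for n in (50, 500):
        print(n, expected_positives(n, 150), round(prob_at_least_one(n, 150), 3))
```

Under these assumptions a single plate of a 50-clone deposit is expected to carry about three positive colonies, whereas a deposit near 500 clones would call for screening several plates.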

Alternatively, two primers of 17-20 nucleotides derived from both ends of the
SEQ ID NO:X (i.e., within the region of SEQ ID NO:X bounded by the 5' NT and the
3' NT of the clone defined in Table 1) are synthesized and used to amplify the desired
cDNA using the deposited cDNA plasmid as a template. The polymerase chain reaction
is carried out under routine conditions, for instance, in 25 µl of reaction mixture with
0.5 µg of the above cDNA template. A convenient reaction mixture is 1.5-5 mM
MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of
each primer and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR (denaturation
at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are
performed with a Perkin-Elmer Cetus automated thermal cycler. The amplified product
is analyzed by agarose gel electrophoresis and the DNA band with expected molecular
weight is excised and purified. The PCR product is verified to be the selected sequence
by subcloning and sequencing the DNA product.
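
Deriving the two primers from the ends of the clone region can be sketched as follows. This is an illustration under stated assumptions, not a primer-design procedure from this specification: positions are taken as 1-based and inclusive, the forward primer is the first 20 nucleotides of the region, the reverse primer is the reverse complement of the last 20, and no check of melting temperature or self-complementarity is made. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return seq.translate(_COMPLEMENT)[::-1]


def end_primers(seq: str, five_nt: int, three_nt: int, length: int = 20):
    """Return (forward, reverse) primers from the ends of the region of `seq`
    bounded by positions five_nt and three_nt (1-based, inclusive)."""
    if not 1 <= five_nt <= three_nt <= len(seq):
        raise ValueError("positions must satisfy 1 <= five_nt <= three_nt <= len(seq)")
    region = seq[five_nt - 1:three_nt]
    if len(region) < 2 * length:
        raise ValueError("region is too short for two non-overlapping primers")
    forward = region[:length]                        # same strand as the listed sequence
    reverse = reverse_complement(region[-length:])   # anneals to the listed strand
    return forward, reverse


if __name__ == "__main__":
    toy = "T" * 5 + "A" * 20 + "C" * 10 + "G" * 20 + "T" * 5   # 60-nucleotide toy sequence
    print(end_primers(toy, five_nt=6, three_nt=55))
```

With the cycling conditions described above (35 cycles of three 1-minute steps), the cycling itself takes about 105 minutes, not counting ramp times.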

Several methods are available for the identification of the 5' or 3' non-coding 
portions of a gene which may not be present in the deposited clone. These methods
include, but are not limited to, filter probing, clone enrichment using specific probes,
and protocols similar or identical to 5' and 3' "RACE" protocols which are well known
in the art. For instance, a method similar to 5' RACE is available for generating the
missing 5' end of a desired full-length transcript. (Fromont-Racine et al., Nucleic Acids
Res. 21(7):1683-1684 (1993).)

Briefly, a specific RNA oligonucleotide is ligated to the 5' ends of a population
of RNA presumably containing full-length gene RNA transcripts. A primer set
containing a primer specific to the ligated RNA oligonucleotide and a primer specific to
a known sequence of the gene of interest is used to PCR amplify the 5' portion of the
desired full-length gene. This amplified product may then be sequenced and used to
generate the full-length gene.

This above method starts with total RNA isolated from the desired source,
although poly-A+ RNA can be used. The RNA preparation can then be treated with
phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged
RNA which may interfere with the later RNA ligase step. The phosphatase should then
be inactivated and the RNA treated with tobacco acid pyrophosphatase in order to
remove the cap structure present at the 5' ends of messenger RNAs. This reaction
leaves a 5' phosphate group at the 5' end of the cap cleaved RNA which can then be
ligated to an RNA oligonucleotide using T4 RNA ligase.

This modified RNA preparation is used as a template for first strand cDNA
synthesis using a gene specific oligonucleotide. The first strand synthesis reaction is
used as a template for PCR amplification of the desired 5' end using a primer specific to
the ligated RNA oligonucleotide and a primer specific to the known sequence of the
gene of interest. The resultant product is then sequenced and analyzed to confirm that
the 5' end sequence belongs to the desired gene.

Example 2: Isolation of Genomic Clones Corresponding to a 
Polynucleotide 

A human genomic P1 library (Genomic Systems, Inc.) is screened by PCR
using primers selected for the cDNA sequence corresponding to SEQ ID NO:X,
according to the method described in Example 1. (See also, Sambrook.)

Example 3: Tissue Distribution of Polypeptide 

Tissue distribution of mRNA expression of polynucleotides of the present
invention is determined using protocols for Northern blot analysis, described by,
among others, Sambrook et al. For example, a cDNA probe produced by the method
described in Example 1 is labeled with P32 using the rediprime™ DNA labeling system
(Amersham Life Science), according to manufacturer's instructions. After labeling, the
probe is purified using CHROMA SPIN-100™ column (Clontech Laboratories, Inc.),
according to manufacturer's protocol number PT1200-1. The purified labeled probe is
then used to examine various human tissues for mRNA expression.

Multiple Tissue Northern (MTN) blots containing various human tissues (H) or
human immune system tissues (IM) (Clontech) are examined with the labeled probe
using ExpressHyb™ hybridization solution (Clontech) according to manufacturer's
protocol number PT1190-1. Following hybridization and washing, the blots are
mounted and exposed to film at -70°C overnight, and the films developed according to
standard procedures.

Example 4: Chromosomal Mapping of the Polynucleotides

An oligonucleotide primer set is designed according to the sequence at the 5'
end of SEQ ID NO:X. This primer preferably spans about 100 nucleotides. This
primer set is then used in a polymerase chain reaction under the following set of
conditions: 30 seconds, 95°C; 1 minute, 56°C; 1 minute, 70°C. This cycle is repeated
32 times followed by one 5 minute cycle at 70°C. Human, mouse, and hamster DNA
is used as template in addition to a somatic cell hybrid panel containing individual
chromosomes or chromosome fragments (Bios, Inc). The reactions are analyzed on
either 8% polyacrylamide gels or 3.5% agarose gels. Chromosome mapping is
determined by the presence of an approximately 100 bp PCR fragment in the particular
somatic cell hybrid.
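
The assignment step described above can be sketched, purely as an illustration, as a concordance check in Python: a chromosome is a candidate when the set of hybrids retaining it is exactly the set of hybrids that yielded the approximately 100 bp product. The panel contents, hybrid names and the function concordant_chromosomes are hypothetical and do not describe the actual Bios panel.

```python
def concordant_chromosomes(panel, positive_hybrids):
    """Return chromosomes whose carriers match the PCR-positive hybrids exactly.

    panel: dict mapping hybrid name -> set of human chromosomes it retains
    positive_hybrids: hybrid names that gave the ~100 bp PCR fragment
    """
    positives = set(positive_hybrids)
    all_chromosomes = set().union(*panel.values())
    hits = []
    for chrom in sorted(all_chromosomes):
        carriers = {name for name, chroms in panel.items() if chrom in chroms}
        if carriers == positives:
            hits.append(chrom)
    return hits


# Hypothetical four-hybrid panel, for illustration only.
PANEL = {
    "hybrid_A": {"1", "7"},
    "hybrid_B": {"7", "X"},
    "hybrid_C": {"1", "X"},
    "hybrid_D": {"12"},
}

if __name__ == "__main__":
    print(concordant_chromosomes(PANEL, {"hybrid_A", "hybrid_B"}))
```

In this invented panel, only chromosome 7 is present in both positive hybrids and absent from both negative hybrids. A real panel may require tolerance for discordant hybrids, which this sketch does not attempt.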

Example 5: Bacterial Expression of a Polypeptide 

A polynucleotide encoding a polypeptide of the present invention is amplified
using PCR oligonucleotide primers corresponding to the 5' and 3' ends of the DNA
sequence, as outlined in Example 1, to synthesize insertion fragments. The primers
used to amplify the cDNA insert should preferably contain restriction sites, such as
BamHI and XbaI, at the 5' end of the primers in order to clone the amplified product
into the expression vector. For example, BamHI and XbaI correspond to the restriction
enzyme sites on the bacterial expression vector pQE-9 (Qiagen, Inc., Chatsworth,
CA). This plasmid vector encodes antibiotic resistance (Ampr), a bacterial origin of
replication (ori), an IPTG-regulatable promoter/operator (P/O), a ribosome binding site
(RBS), a 6-histidine tag (6-His), and restriction enzyme cloning sites.

The pQE-9 vector is digested with BamHI and XbaI and the amplified fragment
is ligated into the pQE-9 vector maintaining the reading frame initiated at the bacterial
RBS. The ligation mixture is then used to transform the E. coli strain M15/rep4
(Qiagen, Inc.) which contains multiple copies of the plasmid pREP4, which expresses
the lacI repressor and also confers kanamycin resistance (Kanr). Transformants are
identified by their ability to grow on LB plates and ampicillin/kanamycin resistant
colonies are selected. Plasmid DNA is isolated and confirmed by restriction analysis.

Clones containing the desired constructs are grown overnight (O/N) in liquid
culture in LB media supplemented with both Amp (100 ug/ml) and Kan (25 ug/ml).
The O/N culture is used to inoculate a large culture at a ratio of 1:100 to 1:250. The
cells are grown to an optical density 600 (O.D.600) of between 0.4 and 0.6. IPTG
(Isopropyl-B-D-thiogalactopyranoside) is then added to a final concentration of 1 mM.
IPTG induces by inactivating the lacI repressor, clearing the P/O leading to increased
gene expression.
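
The volumes implied by the inoculation ratio and the IPTG concentration above follow from simple dilution arithmetic, sketched below in Python. The 1 liter culture volume and the 1 M IPTG stock are assumptions chosen for illustration and are not specified in this example; the function names are likewise hypothetical.

```python
def inoculum_volume_ml(culture_volume_ml, dilution):
    """Overnight culture needed for a 1:dilution inoculation."""
    return culture_volume_ml / dilution


def stock_volume_ml(final_mm, stock_mm, culture_volume_ml):
    """Stock volume from C1*V1 = C2*V2, neglecting the small added volume."""
    return final_mm * culture_volume_ml / stock_mm


def ready_to_induce(od600):
    """True when the culture is inside the 0.4 to 0.6 O.D.600 window."""
    return 0.4 <= od600 <= 0.6


if __name__ == "__main__":
    culture_ml = 1000.0  # assumed culture size
    print("O/N culture at 1:100 (ml):", inoculum_volume_ml(culture_ml, 100))
    print("O/N culture at 1:250 (ml):", inoculum_volume_ml(culture_ml, 250))
    # Assumed 1 M (1000 mM) IPTG stock, 1 mM final concentration.
    print("IPTG stock (ml):", stock_volume_ml(1.0, 1000.0, culture_ml))
```

For the assumed 1 liter culture, this gives 4 to 10 ml of overnight culture and 1 ml of the assumed 1 M IPTG stock.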

Cells are grown for an extra 3 to 4 hours. Cells are then harvested by
centrifugation (20 mins at 6000 x g). The cell pellet is solubilized in the chaotropic
agent 6 Molar Guanidine HCl by stirring for 3-4 hours at 4°C. The cell debris is
removed by centrifugation, and the supernatant containing the polypeptide is loaded
onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column (available from
QIAGEN, Inc., supra). Proteins with a 6 x His tag bind to the Ni-NTA resin with high
affinity and can be purified in a simple one-step procedure (for details see: The
QIAexpressionist (1995) QIAGEN, Inc., supra).

Briefly, the supernatant is loaded onto the column in 6 M guanidine-HCl, pH 8;
the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed
with 10 volumes of 6 M guanidine-HCl, pH 6, and finally the polypeptide is eluted with
6 M guanidine-HCl, pH 5.

The purified protein is then renatured by dialyzing it against phosphate-buffered
saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl. Alternatively, the
protein can be successfully refolded while immobilized on the Ni-NTA column. The
recommended conditions are as follows: renature using a linear 6 M-1 M urea gradient in
500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, containing protease inhibitors.
The renaturation should be performed over a period of 1.5 hours or more. After
renaturation the proteins are eluted by the addition of 250 mM imidazole. Imidazole
is removed by a final dialyzing step against PBS or 50 mM sodium acetate pH 6 buffer
plus 200 mM NaCl. The purified protein is stored at 4°C or frozen at -80°C.

In addition to the above expression vector, the present invention further includes
an expression vector comprising phage operator and promoter elements operatively
linked to a polynucleotide of the present invention, called pHE4a. (ATCC Accession
Number 209645, deposited on February 25, 1998.) This vector contains: 1) a
neomycinphosphotransferase gene as a selection marker, 2) an E. coli origin of
replication, 3) a T5 phage promoter sequence, 4) two lac operator sequences, 5) a
Shine-Dalgarno sequence, and 6) the lactose operon repressor gene (lacIq). The origin
of replication (oriC) is derived from pUC19 (LTI, Gaithersburg, MD). The promoter
sequence and operator sequences are made synthetically.

DNA can be inserted into the pHE4a by restricting the vector with NdeI and
XbaI, BamHI, XhoI, or Asp718, running the restricted product on a gel, and isolating
the larger fragment (the stuffer fragment should be about 310 base pairs). The DNA
insert is generated according to the PCR protocol described in Example 1, using PCR
primers having restriction sites for NdeI (5' primer) and XbaI, BamHI, XhoI, or
Asp718 (3' primer). The PCR insert is gel purified and restricted with compatible
enzymes. The insert and vector are ligated according to standard protocols.

The engineered vector could easily be substituted in the above protocol to
express protein in a bacterial system.
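
The construction of primers carrying restriction sites at their 5' ends, as described above, may be sketched in Python as follows. The recognition sequences listed are the standard ones for the named enzymes. The two-nucleotide "GC" clamp, the 18-nucleotide annealing length, the example insert and the function names are assumptions made for illustration only. The sketch simply prepends the site; it does not merge the ATG within the NdeI site with the initiation codon, and it does not check that the reading frame is maintained.

```python
# Recognition sequences of the enzymes named in the text.
SITES = {
    "NdeI": "CATATG",
    "XbaI": "TCTAGA",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "Asp718": "GGTACC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primers(insert, five_site="NdeI", three_site="XbaI",
                   anneal_len=18, clamp="GC"):
    """Return (forward, reverse) primers with restriction sites at the 5' ends.

    clamp and anneal_len are illustrative defaults, not values from the text.
    """
    insert = insert.upper()
    if len(insert) < anneal_len:
        raise ValueError("insert is shorter than the annealing length")
    forward = clamp + SITES[five_site] + insert[:anneal_len]
    reverse = clamp + SITES[three_site] + reverse_complement(insert[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Invented 42-nucleotide insert, for illustration only.
    example = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCTAA"
    fwd, rev = tailed_primers(example)
    print("5' primer:", fwd)
    print("3' primer:", rev)
```

The 3' primer is built from the reverse complement of the end of the insert, so that both primers read 5' to 3' with the restriction site upstream of the annealing region.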

Example 6: Purification of a Polypeptide from an Inclusion Body 

The following alternative method can be used to purify a polypeptide expressed
in E. coli when it is present in the form of inclusion bodies. Unless otherwise specified,
all of the following steps are conducted at 4-10°C.

Upon completion of the production phase of the E. coli fermentation, the cell
culture is cooled to 4-10°C and the cells harvested by continuous centrifugation at
15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit
weight of cell paste and the amount of purified protein required, an appropriate amount
of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50
mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a
high shear mixer.

The cells are then lysed by passing the solution through a microfluidizer
(Microfluidics, Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is
then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by
centrifugation at 7000 x g for 15 min. The resultant pellet is washed again using 0.5 M
NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.

The resulting washed inclusion bodies are solubilized with 1.5 M guanidine
hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the
pellet is discarded and the polypeptide-containing supernatant is incubated at 4°C
overnight to allow further GuHCl extraction.

Following high speed centrifugation (30,000 x g) to remove insoluble particles,
the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20
volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by
vigorous stirring. The refolded diluted protein solution is kept at 4°C without mixing
for 12 hours prior to further purification steps.
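
The effect of the 20-volume dilution on the denaturant can be estimated with a one-line calculation, sketched here in Python. The function name diluted_concentration is hypothetical, and the sketch assumes that the refolding buffer itself contains no GuHCl and that the volumes are additive.

```python
def diluted_concentration(stock_conc, sample_volumes=1.0, buffer_volumes=20.0):
    """Concentration after mixing a sample with solute-free buffer.

    One volume of extract plus 20 volumes of buffer is a 21-fold dilution.
    """
    return stock_conc * sample_volumes / (sample_volumes + buffer_volumes)


if __name__ == "__main__":
    # 1.5 M GuHCl extract, as in the solubilization step above.
    print("GuHCl after dilution (M): %.3f" % diluted_concentration(1.5))
```

Under these assumptions, the 1.5 M GuHCl extract is reduced to roughly 0.07 M GuHCl, and the protein is diluted by the same 21-fold factor.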

To clarify the refolded polypeptide solution, a previously prepared tangential
filtration unit equipped with 0.16 µm membrane filter with appropriate surface area
(e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The
filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive
Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted
with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a
stepwise manner. The absorbance at 280 nm of the effluent is continuously monitored.
Fractions are collected and further analyzed by SDS-PAGE.

Fractions containing the polypeptide are then pooled and mixed with 4 volumes
of water. The diluted sample is then loaded onto a previously prepared set of tandem
columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak anion
(Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated
with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium
acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column
volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0
M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280
monitoring of the effluent. Fractions containing the polypeptide (determined, for
instance, by 16% SDS-PAGE) are then pooled.
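
For orientation, the nominal NaCl concentration at any point in the 10 column volume linear gradient can be computed by linear interpolation, as in the following Python sketch. The function name linear_gradient is hypothetical, and the sketch gives the programmed value only; the concentration actually reaching the column outlet lags behind by the system dead volume.

```python
def linear_gradient(start, end, total_cv, elapsed_cv):
    """Programmed concentration after elapsed_cv column volumes."""
    if not 0 <= elapsed_cv <= total_cv:
        raise ValueError("elapsed_cv must lie between 0 and total_cv")
    return start + (end - start) * elapsed_cv / total_cv


if __name__ == "__main__":
    # 0.2 M to 1.0 M NaCl over 10 column volumes, as in the text above.
    for cv in range(0, 11, 2):
        print("%2d CV: %.2f M NaCl" % (cv, linear_gradient(0.2, 1.0, 10, cv)))
```

Halfway through the gradient (5 column volumes) the programmed NaCl concentration is 0.6 M, which can help relate the fraction in which the polypeptide elutes to an approximate salt concentration.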

The resultant polypeptide should exhibit greater than 95% purity after the above
refolding and purification steps. No major contaminant bands should be observed from
Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded.
The purified protein can also be tested for endotoxin/LPS contamination, and typically
the LPS content is less than 0.1 ng/ml according to LAL assays.



Example 7: Cloning and Expression of a Polypeptide in a Baculovirus 
Expression System 

In this example, the plasmid shuttle vector pA2 is used to insert a polynucleotide
into a baculovirus to express a polypeptide. This expression vector contains the strong
polyhedrin promoter of the Autographa californica nuclear polyhedrosis virus
(AcMNPV) followed by convenient restriction sites such as BamHI, Xba I and
Asp718. The polyadenylation site of the simian virus 40 ("SV40") is used for efficient
polyadenylation. For easy selection of recombinant virus, the plasmid contains the
beta-galactosidase gene from E. coli under control of a weak Drosophila promoter in the
same orientation, followed by the polyadenylation signal of the polyhedrin gene. The
inserted genes are flanked on both sides by viral sequences for cell-mediated
homologous recombination with wild-type viral DNA to generate a viable virus that
expresses the cloned polynucleotide.

Many other baculovirus vectors can be used in place of the vector above, such
as pAc373, pVL941, and pAcIM1, as one skilled in the art would readily appreciate, as
long as the construct provides appropriately located signals for transcription,
translation, secretion and the like, including a signal peptide and an in-frame AUG as
required. Such vectors are described, for instance, in Luckow et al., Virology
170:31-39 (1989).

Specifically, the cDNA sequence contained in the deposited clone, including the
AUG initiation codon and the naturally associated leader sequence identified in Table 1,
is amplified using the PCR protocol described in Example 1. If the naturally occurring
signal sequence is used to produce the secreted protein, the pA2 vector does not need a
second signal peptide. Alternatively, the vector can be modified (pA2 GP) to include a
baculovirus leader sequence, using the standard methods described in Summers et al.,
"A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures,"
Texas Agricultural Experimental Station Bulletin No. 1555 (1987).

The amplified fragment is isolated from a 1% agarose gel using a commercially
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested
with appropriate restriction enzymes and again purified on a 1% agarose gel.

The plasmid is digested with the corresponding restriction enzymes and,
optionally, can be dephosphorylated using calf intestinal phosphatase, using routine
procedures known in the art. The DNA is then isolated from a 1% agarose gel using a
commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.).

The fragment and the dephosphorylated plasmid are ligated together with T4
DNA ligase. E. coli HB101 or other suitable E. coli hosts such as XL-1 Blue
(Stratagene Cloning Systems, La Jolla, CA) cells are transformed with the ligation
mixture and spread on culture plates. Bacteria containing the plasmid are identified by
digesting DNA from individual colonies and analyzing the digestion product by gel
electrophoresis. The sequence of the cloned fragment is confirmed by DNA
sequencing.

Five µg of a plasmid containing the polynucleotide is co-transfected with 1.0 µg
of a commercially available linearized baculovirus DNA ("BaculoGold™ baculovirus
DNA", Pharmingen, San Diego, CA), using the lipofection method described by
Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). One µg of
BaculoGold™ virus DNA and 5 µg of the plasmid are mixed in a sterile well of a
microtiter plate containing 50 µl of serum-free Grace's medium (Life Technologies
Inc., Gaithersburg, MD). Afterwards, 10 µl Lipofectin plus 90 µl Grace's medium are
added, mixed and incubated for 15 minutes at room temperature. Then the transfection
mixture is added drop-wise to Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm
tissue culture plate with 1 ml Grace's medium without serum. The plate is then
incubated for 5 hours at 27°C. The transfection solution is then removed from the plate
and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum is added.
Cultivation is then continued at 27°C for four days.

After four days the supernatant is collected and a plaque assay is performed, as
described by Summers and Smith, supra. An agarose gel with "Blue Gal" (Life
Technologies Inc., Gaithersburg) is used to allow easy identification and isolation of
gal-expressing clones, which produce blue-stained plaques. (A detailed description of a
"plaque assay" of this type can also be found in the user's guide for insect cell culture
and baculovirology distributed by Life Technologies Inc., Gaithersburg, page 9-10.)
After appropriate incubation, blue stained plaques are picked with the tip of a
micropipettor (e.g., Eppendorf). The agar containing the recombinant viruses is then
resuspended in a microcentrifuge tube containing 200 µl of Grace's medium and the
suspension containing the recombinant baculovirus is used to infect Sf9 cells seeded in
35 mm dishes. Four days later the supernatants of these culture dishes are harvested
and then they are stored at 4°C.

To verify the expression of the polypeptide, Sf9 cells are grown in Grace's
medium supplemented with 10% heat-inactivated FBS. The cells are infected with the
recombinant baculovirus containing the polynucleotide at a multiplicity of infection
("MOI") of about 2. If radiolabeled proteins are desired, 6 hours later the medium is
removed and is replaced with SF900 II medium minus methionine and cysteine
(available from Life Technologies Inc., Rockville, MD). After 42 hours, 5 µCi of
35S-methionine and 5 µCi 35S-cysteine (available from Amersham) are added. The cells are
further incubated for 16 hours and then are harvested by centrifugation. The proteins
in the supernatant as well as the intracellular proteins are analyzed by SDS-PAGE
followed by autoradiography (if radiolabeled).
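
The volume of virus stock needed to reach a given multiplicity of infection follows from the definition of MOI as plaque-forming units per cell. The Python sketch below illustrates the calculation; the cell number and virus titer are invented values, since neither is specified in this example, and the function name is hypothetical.

```python
def virus_inoculum_ml(moi, cell_count, titer_pfu_per_ml):
    """Volume of virus stock giving the requested MOI (pfu per cell)."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * cell_count / titer_pfu_per_ml


if __name__ == "__main__":
    # Assumed: 1e6 Sf9 cells per dish and a stock titer of 1e8 pfu/ml.
    volume = virus_inoculum_ml(moi=2, cell_count=1e6, titer_pfu_per_ml=1e8)
    print("virus stock to add (ml): %.3f" % volume)
```

With the assumed values, an MOI of about 2 corresponds to 0.02 ml (20 µl) of virus stock per dish; the titer of an actual stock would have to be determined, for instance by the plaque assay mentioned above.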

Microsequencing of the amino acid sequence of the amino terminus of purified 
protein may be used to determine the amino terminal sequence of the produced 
protein. 



Example 8: Expression of a Polypeptide in Mammalian Cells

The polypeptide of the present invention can be expressed in a mammalian cell.
A typical mammalian expression vector contains a promoter element, which mediates
the initiation of transcription of mRNA, a protein coding sequence, and signals required
for the termination of transcription and polyadenylation of the transcript. Additional
elements include enhancers, Kozak sequences and intervening sequences flanked by
donor and acceptor sites for RNA splicing. Highly efficient transcription is achieved
with the early and late promoters from SV40, the long terminal repeats (LTRs) from
Retroviruses, e.g., RSV, HTLVI, HIVI and the early promoter of the cytomegalovirus
(CMV). However, cellular elements can also be used (e.g., the human actin promoter).

Suitable expression vectors for use in practicing the present invention include,
for example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden),
pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146), pBC12MI (ATCC 67109),
pCMVSport 2.0, and pCMVSport 3.0. Mammalian host cells that could be used
include human Hela, 293, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1,
Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO)
cells.

Alternatively, the polypeptide can be expressed in stable cell lines containing the
polynucleotide integrated into a chromosome. The co-transfection with a selectable
marker such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation
of the transfected cells.

The transfected gene can also be amplified to express large amounts of the
encoded protein. The DHFR (dihydrofolate reductase) marker is useful in developing
cell lines that carry several hundred or even several thousand copies of the gene of
interest. (See, e.g., Alt, F. W., et al., J. Biol. Chem. 253:1357-1370 (1978); Hamlin,
J. L. and Ma, C., Biochem. et Biophys. Acta, 1097:107-143 (1990); Page, M. J. and
Sydenham, M. A., Biotechnology 9:64-68 (1991).) Another useful selection marker is
the enzyme glutamine synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991);
Bebbington et al., Bio/Technology 10:169-175 (1992)). Using these markers, the
mammalian cells are grown in selective medium and the cells with the highest resistance
are selected. These cell lines contain the amplified gene(s) integrated into a
chromosome. Chinese hamster ovary (CHO) and NSO cells are often used for the
production of proteins.

Derivatives of the plasmid pSV2-dhfr (ATCC Accession No. 37146), the
expression vectors pC4 (ATCC Accession No. 209646) and pC6 (ATCC Accession
No. 209647) contain the strong promoter (LTR) of the Rous Sarcoma Virus (Cullen et
al., Molecular and Cellular Biology, 438-447 (March, 1985)) plus a fragment of the
CMV-enhancer (Boshart et al., Cell 41:521-530 (1985)). Multiple cloning sites, e.g.,
with the restriction enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the
cloning of the gene of interest. The vectors also contain the 3' intron, the
polyadenylation and termination signal of the rat preproinsulin gene, and the mouse
DHFR gene under control of the SV40 early promoter.

Specifically, the plasmid pC6, for example, is digested with appropriate
restriction enzymes and then dephosphorylated using calf intestinal phosphatase by
procedures known in the art. The vector is then isolated from a 1% agarose gel.

A polynucleotide of the present invention is amplified according to the protocol
outlined in Example 1. If the naturally occurring signal sequence is used to produce the
secreted protein, the vector does not need a second signal peptide. Alternatively, if the
naturally occurring signal sequence is not used, the vector can be modified to include a
heterologous signal sequence. (See, e.g., WO 96/34891.)

The amplified fragment is isolated from a 1% agarose gel using a commercially 
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested 
with appropriate restriction enzymes and again purified on a 1% agarose gel. 

The amplified fragment is then digested with the same restriction enzyme and
purified on a 1% agarose gel. The isolated fragment and the dephosphorylated vector
are then ligated with T4 DNA ligase. E. coli HB101 or XL-1 Blue cells are then
transformed and bacteria are identified that contain the fragment inserted into plasmid
pC6 using, for instance, restriction enzyme analysis.

Chinese hamster ovary cells lacking an active DHFR gene are used for
transfection. Five µg of the expression plasmid pC6 is cotransfected with 0.5 µg of the
plasmid pSVneo using lipofectin (Felgner et al., supra). The plasmid pSV2-neo
contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that
confers resistance to a group of antibiotics including G418. The cells are seeded in
alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are
trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus
MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418.
After about 10-14 days single clones are trypsinized and then seeded in 6-well petri
dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM,
200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of
methotrexate are then transferred to new 6-well plates containing even higher
concentrations of methotrexate (1 µM, 2 µM, 5 µM, 10 mM, 20 mM). The same
procedure is repeated until clones are obtained which grow at a concentration of
100-200 µM. Expression of the desired gene product is analyzed, for instance, by
SDS-PAGE and Western blot or by reversed phase HPLC analysis.



Example 9: Protein Fusions

The polypeptides of the present invention are preferably fused to other proteins.
These fusion proteins can be used for a variety of applications. For example, fusion of
the present polypeptides to His-tag, HA-tag, protein A, IgG domains, and maltose
binding protein facilitates purification. (See Example 5; see also EP A 394,827;
Traunecker, et al., Nature 331:84-86 (1988).) Similarly, fusion to IgG-1, IgG-3, and
albumin increases the half-life in vivo. Nuclear localization signals fused to the
polypeptides of the present invention can target the protein to a specific subcellular
localization, while covalent heterodimers or homodimers can increase or decrease the
activity of a fusion protein. Fusion proteins can also create chimeric molecules having
more than one function. Finally, fusion proteins can increase solubility and/or stability
of the fused protein compared to the non-fused protein. All of the types of fusion
proteins described above can be made by modifying the following protocol, which
outlines the fusion of a polypeptide to an IgG molecule, or the protocol described in
Example 5.

Briefly, the human Fc portion of the IgG molecule can be PCR amplified, using
primers that span the 5' and 3' ends of the sequence described below. These primers
also should have convenient restriction enzyme sites that will facilitate cloning into an
expression vector, preferably a mammalian expression vector.

For example, if pC4 (Accession No. 209646) is used, the human Fc portion can
be ligated into the BamHI cloning site. Note that the 3' BamHI site should be
destroyed. Next, the vector containing the human Fc portion is re-restricted with
BamHI, linearizing the vector, and a polynucleotide of the present invention, isolated
by the PCR protocol described in Example 1, is ligated into this BamHI site. Note that
the polynucleotide is cloned without a stop codon, otherwise a fusion protein will not
be produced.
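
The requirement that the polynucleotide be cloned without a stop codon can be checked on the insert sequence before cloning. The following Python sketch is illustrative only: it tests that an open reading frame begins with ATG, has a length divisible by three, and contains no in-frame stop codon. The function names and the short test sequences are hypothetical, and the sketch does not examine the junction nucleotides contributed by the restriction site, which must also preserve the reading frame into the Fc portion.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete codons, reading from position 0."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def strip_terminal_stop(orf):
    """Remove a final stop codon, if present, so translation can read through."""
    orf = orf.upper()
    return orf[:-3] if orf[-3:] in STOP_CODONS else orf


def fusion_ready(orf):
    """True if the ORF starts with ATG, is a whole number of codons,
    and has no in-frame stop codon."""
    orf = orf.upper()
    return (
        orf.startswith("ATG")
        and len(orf) % 3 == 0
        and not any(c in STOP_CODONS for c in codons(orf))
    )


if __name__ == "__main__":
    native = "ATGGCTTAA"  # invented ORF ending in a stop codon
    print(fusion_ready(native))                       # stop codon still present
    print(fusion_ready(strip_terminal_stop(native)))  # stop codon removed
```

In practice the stop codon is usually omitted by designing the 3' primer to end at the last sense codon, rather than by editing the sequence afterwards.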

If the naturally occurring signal sequence is used to produce the secreted 
protein, pC4 does not need a second signal peptide. Alternatively, if the naturally 
occurring signal sequence is not used, the vector can be modified to include a 
heterologous signal sequence. (See, e.g., WO 96/34891.)



Human IgG Fc region: 

GGGATCCGGAGCCCAAATCTTCTGACAAAACTCACACATGCCCACCGTGCC
CAGCACCTGAATTCGAGGGTGCACCGTCAGTCTTCCTCTTCCCCCCAAAACC
CAAGGACACCCTCATGATCTCCCGGACTCCTGAGGTCACATGCGTGGTGGT
GGACGTAAGCCACGAAGACCCTGAGGTCAAGTTCAACTGGTACGTGGACG
GCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTACAAC
AGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTG
AATGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGCCCTCCCAACCCCC
ATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAACCACAGGT
GTACACCCTGCCCCCATCCCGGGATGAGCTGACCAAGAACCAGGTCAGCCT
GACCTGCCTGGTCAAAGGCTTCTATCCAAGCGACATCGCCGTGGAGTGGGA
GAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGG
ACTCCGACGGCTCCTTCTTCCTCTACAGCAAGCTCACCGTGGACAAGAGCA
GGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGC
ACAACCACTACACGCAGAAGAGCCTCTCCCTGTCTCCGGGTAAATGAGTGC
GACGGCCGCGACTCTAGAGGAT (SEQ ID NO:1)

Example 10: Production of an Antibody from a Polypeptide 

The antibodies of the present invention can be prepared by a variety of methods.
(See, Current Protocols, Chapter 2.) For example, cells expressing a polypeptide of
the present invention are administered to an animal to induce the production of sera
containing polyclonal antibodies. In a preferred method, a preparation of the secreted
protein is prepared and purified to render it substantially free of natural contaminants.
Such a preparation is then introduced into an animal in order to produce polyclonal
antisera of greater specific activity.

In the most preferred method, the antibodies of the present invention are
monoclonal antibodies (or protein binding fragments thereof). Such monoclonal
antibodies can be prepared using hybridoma technology. (Kohler et al., Nature
256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J.
Immunol. 6:292 (1976); Hammerling et al., in: Monoclonal Antibodies and T-Cell
Hybridomas, Elsevier, N.Y., pp. 563-681 (1981).) In general, such procedures
involve immunizing an animal (preferably a mouse) with polypeptide or, more
preferably, with a secreted polypeptide-expressing cell. Such cells may be cultured in
any suitable tissue culture medium; however, it is preferable to culture cells in Earle's
modified Eagle's medium supplemented with 10% fetal bovine serum (inactivated at
about 56°C), and supplemented with about 10 g/l of nonessential amino acids, about
1,000 U/ml of penicillin, and about 100 µg/ml of streptomycin.

The splenocytes of such mice are extracted and fused with a suitable myeloma
cell line. Any suitable myeloma cell line may be employed in accordance with the
present invention; however, it is preferable to employ the parent myeloma cell line
(SP20), available from the ATCC. After fusion, the resulting hybridoma cells are
selectively maintained in HAT medium, and then cloned by limiting dilution as
described by Wands et al. (Gastroenterology 80:225-232 (1981).) The hybridoma cells
obtained through such a selection are then assayed to identify clones which secrete
antibodies capable of binding the polypeptide.

Alternatively, additional antibodies capable of binding to the polypeptide can be
produced in a two-step procedure using anti-idiotypic antibodies. Such a method
makes use of the fact that antibodies are themselves antigens, and therefore, it is
possible to obtain an antibody which binds to a second antibody. In accordance with
this method, protein specific antibodies are used to immunize an animal, preferably a
mouse. The splenocytes of such an animal are then used to produce hybridoma cells,
and the hybridoma cells are screened to identify clones which produce an antibody
whose ability to bind to the protein-specific antibody can be blocked by the polypeptide.
Such antibodies comprise anti-idiotypic antibodies to the protein-specific antibody and
can be used to immunize an animal to induce formation of further protein-specific
antibodies.

It will be appreciated that Fab and F(ab')2 and other fragments of the antibodies
of the present invention may be used according to the methods disclosed herein. Such
fragments are typically produced by proteolytic cleavage, using enzymes such as papain
(to produce Fab fragments) or pepsin (to produce F(ab')2 fragments). Alternatively,
secreted protein-binding fragments can be produced through the application of
recombinant DNA technology or through synthetic chemistry.

For in vivo use of antibodies in humans, it may be preferable to use 
"humanized" chimeric monoclonal antibodies. Such antibodies can be produced using 
genetic constructs derived from hybridoma cells producing the monoclonal antibodies 
described above. Methods for producing chimeric antibodies are known in the art. 

(See, for review, Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214
(1986); Cabilly et al., U.S. Patent No. 4,816,567; Taniguchi et al., EP 171496;
Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO
8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268
(1985).)

Example 11: Production Of Secreted Protein For High-Throughput 
Screening Assays 

The following protocol produces a supernatant containing a polypeptide to be
tested. This supernatant can then be used in the Screening Assays described in
Examples 13-20.

First, dilute Poly-D-Lysine (644 587 Boehringer-Mannheim) stock solution
(1 mg/ml in PBS) 1:20 in PBS (w/o calcium or magnesium 17-516F Biowhittaker) for a
working solution of 50 ug/ml. Add 200 ul of this solution to each well (24 well plates)
and incubate at RT for 20 minutes. Be sure to distribute the solution over each well
(note: a 12-channel pipetter may be used with tips on every other channel). Aspirate off
the Poly-D-Lysine solution and rinse with 1 ml PBS (Phosphate Buffered Saline). The
PBS should remain in the well until just prior to plating the cells, and plates may be
poly-lysine coated in advance for up to two weeks.
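The dilution arithmetic in the coating step can be checked with a short sketch. The helper names and the four-plate batch size below are illustrative assumptions (four 24-well plates are the number handled per round later in this Example); only the 1 mg/ml stock, the 1:20 dilution, and the 200 ul per well come from the protocol.

```python
# Sketch: Poly-D-Lysine working solution for coating 24-well plates.
# Names and the batch size are illustrative, not part of the protocol text.

STOCK_UG_PER_ML = 1000.0   # 1 mg/ml stock expressed in ug/ml
DILUTION_FACTOR = 20       # 1:20 in PBS
WELLS_PER_PLATE = 24
UL_PER_WELL = 200


def working_concentration(stock_ug_per_ml: float, factor: int) -> float:
    """Concentration after a 1:factor dilution."""
    return stock_ug_per_ml / factor


def coating_volumes_ml(n_plates: int) -> tuple[float, float]:
    """Return (working solution, stock) volumes in ml needed to coat n_plates,
    with no allowance for dead volume."""
    working_ml = n_plates * WELLS_PER_PLATE * UL_PER_WELL / 1000.0
    return working_ml, working_ml / DILUTION_FACTOR


working_ug_per_ml = working_concentration(STOCK_UG_PER_ML, DILUTION_FACTOR)
working_ml, stock_ml = coating_volumes_ml(4)
print(working_ug_per_ml, working_ml, stock_ml)
```

In practice some excess would be prepared to cover pipetting losses; the sketch reports only the theoretical minimum.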

Plate 293T cells (do not carry cells past P+20) at 2 x 10^ cells/well in .5 ml
DMEM (Dulbecco's Modified Eagle Medium) (with 4.5 G/L glucose and L-glutamine
(12-604F Biowhittaker))/10% heat inactivated FBS (14-503F Biowhittaker)/1x
Penstrep (17-602E Biowhittaker). Let the cells grow overnight.

The next day, mix together in a sterile solution basin: 300 ul Lipofectamine
(18324-012 Gibco/BRL) and 5 ml Optimem I (31985070 Gibco/BRL)/96-well plate.
With a small volume multi-channel pipetter, aliquot approximately 2 ug of an expression
vector containing a polynucleotide insert, produced by the methods described in
Examples 8 or 9, into an appropriately labeled 96-well round bottom plate. With a
multi-channel pipetter, add 50 ul of the Lipofectamine/Optimem I mixture to each well.
Pipette up and down gently to mix. Incubate at RT 15-45 minutes. After about 20
minutes, use a multi-channel pipetter to add 150 ul Optimem I to each well. As a
control, one plate of vector DNA lacking an insert should be transfected with each set of
transfections.
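As a rough check on the volumes in this step, the following sketch confirms that the master mix covers one 96-well plate and that each well ends up with 200 ul of complex. The variable names are illustrative, and the volume contributed by the DNA aliquot is neglected, which is an assumption.

```python
# Sketch: volume bookkeeping for the Lipofectamine/Optimem I master mix.
# The DNA aliquot volume is ignored (assumption); names are illustrative.

LIPOFECTAMINE_UL = 300.0    # per 96-well plate
OPTIMEM_UL = 5000.0         # 5 ml per 96-well plate
WELLS = 96
MIX_PER_WELL_UL = 50.0      # master mix added to each DNA well
OPTIMEM_TOPUP_UL = 150.0    # Optimem I added after about 20 minutes

master_mix_ul = LIPOFECTAMINE_UL + OPTIMEM_UL
needed_ul = WELLS * MIX_PER_WELL_UL
excess_ul = master_mix_ul - needed_ul

# Lipofectamine actually delivered to each well via the master mix.
lipofectamine_per_well_ul = MIX_PER_WELL_UL * LIPOFECTAMINE_UL / master_mix_ul

# Final complex volume per well, later transferred onto the cells.
complex_per_well_ul = MIX_PER_WELL_UL + OPTIMEM_TOPUP_UL

print(master_mix_ul, needed_ul, excess_ul)
print(round(lipofectamine_per_well_ul, 2), complex_per_well_ul)
```

The 500 ul surplus leaves a margin for reservoir dead volume, and the 200 ul per well matches the complex volume added to the cells in the transfection step.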

Preferably, the transfection should be performed by tag-teaming the following
tasks. By tag-teaming, hands-on time is cut in half, and the cells do not spend too
much time on PBS. First, person A aspirates off the media from four 24-well plates of
cells, and then person B rinses each well with .5-1 ml PBS. Person A then aspirates off
the PBS rinse, and person B, using a 12-channel pipetter with tips on every other channel,
adds the 200 ul of DNA/Lipofectamine/Optimem I complex to the odd wells first, then to
the even wells, to each row on the 24-well plates. Incubate at 37°C for 6 hours.

While cells are incubating, prepare appropriate media, either 1% BSA in DMEM
with 1x penstrep, or CHO-5 media (116.6 mg/L of CaCl2 (anhyd); 0.00130 mg/L
CuSO4-5H2O; 0.050 mg/L of Fe(NO3)3-9H2O; 0.417 mg/L of FeSO4-7H2O; 311.80
mg/L of KCl; 28.64 mg/L of MgCl2; 48.84 mg/L of MgSO4; 6995.50 mg/L of NaCl;
2400.0 mg/L of NaHCO3; 62.50 mg/L of NaH2PO4-H2O; 71.02 mg/L of Na2HPO4;
.4320 mg/L of ZnSO4-7H2O; .002 mg/L of Arachidonic Acid; 1.022 mg/L of
Cholesterol; .070 mg/L of DL-alpha-Tocopherol-Acetate; 0.0520 mg/L of Linoleic
Acid; 0.010 mg/L of Linolenic Acid; 0.010 mg/L of Myristic Acid; 0.010 mg/L of Oleic
Acid; 0.010 mg/L of Palmitric Acid; 0.010 mg/L of Palmitic Acid; 100 mg/L of
Pluronic F-68; 0.010 mg/L of Stearic Acid; 2.20 mg/L of Tween 80; 4551 mg/L of
D-Glucose; 130.85 mg/ml of L-Alanine; 147.50 mg/ml of L-Arginine-HCL; 7.50 mg/ml
of L-Asparagine-H2O; 6.65 mg/ml of L-Aspartic Acid; 29.56 mg/ml of
L-Cystine-2HCL-H2O; 31.29 mg/ml of L-Cystine-2HCL; 7.35 mg/ml of L-Glutamic Acid; 365.0
mg/ml of L-Glutamine; 18.75 mg/ml of Glycine; 52.48 mg/ml of
L-Histidine-HCL-H2O; 106.97 mg/ml of L-Isoleucine; 111.45 mg/ml of L-Leucine; 163.75 mg/ml of
L-Lysine HCL; 32.34 mg/ml of L-Methionine; 68.48 mg/ml of L-Phenylalanine; 40.0
mg/ml of L-Proline; 26.25 mg/ml of L-Serine; 101.05 mg/ml of L-Threonine; 19.22
mg/ml of L-Tryptophan; 91.79 mg/ml of L-Tyrosine-2Na-2H2O; 99.65 mg/ml of
L-Valine; 0.0035 mg/L of Biotin; 3.24 mg/L of D-Ca Pantothenate; 11.78 mg/L of
Choline Chloride; 4.65 mg/L of Folic Acid; 15.60 mg/L of i-Inositol; 3.02 mg/L of
Niacinamide; 3.00 mg/L of Pyridoxal HCL; 0.031 mg/L of Pyridoxine HCL; 0.319
mg/L of Riboflavin; 3.17 mg/L of Thiamine HCL; 0.365 mg/L of Thymidine; and
0.680 mg/L of Vitamin B12; 25 mM of HEPES Buffer; 2.39 mg/L of Na Hypoxanthine;
0.105 mg/L of Lipoic Acid; 0.081 mg/L of Sodium Putrescine-2HCL; 55.0 mg/L of
Sodium Pyruvate; 0.0067 mg/L of Sodium Selenite; 20 uM of Ethanolamine; 0.122
mg/L of Ferric Citrate; 41.70 mg/L of Methyl-B-Cyclodextrin complexed with Linoleic
Acid; 33.33 mg/L of Methyl-B-Cyclodextrin complexed with Oleic Acid; and 10 mg/L
of Methyl-B-Cyclodextrin complexed with Retinal) with 2 mM glutamine and 1x
penstrep. (BSA (81-068-3 Bayer) 100 gm dissolved in 1 L DMEM for a 10% BSA stock
solution). Filter the media and collect 50 ul for endotoxin assay in 15 ml polystyrene
conical.

The transfection reaction is terminated, preferably by tag-teaming, at the end of
the incubation period. Person A aspirates off the transfection media, while person B
adds 1.5 ml appropriate media to each well. Incubate at 37°C for 45 or 72 hours
depending on the media used: 1% BSA for 45 hours or CHO-5 for 72 hours.

On day four, using a 300 ul multichannel pipetter, aliquot 600 ul in one 1 ml deep
well plate and the remaining supernatant into a 2 ml deep well. The supernatants from
each well can then be used in the assays described in Examples 13-20.

It is specifically understood that when activity is obtained in any of the assays
described below using a supernatant, the activity originates from either the polypeptide
directly (e.g., as a secreted protein) or by the polypeptide inducing expression of other
proteins, which are then secreted into the supernatant. Thus, the invention further
provides a method of identifying the protein in the supernatant characterized by an
activity in a particular assay.

Example 12: Construction of GAS Reporter Construct

One signal transduction pathway involved in the differentiation and proliferation
of cells is called the Jaks-STATs pathway. Activated proteins in the Jaks-STATs
pathway bind to gamma activation site ("GAS") elements or interferon-sensitive
responsive element ("ISRE"), located in the promoter of many genes. The binding of a
protein to these elements alters the expression of the associated gene.

GAS and ISRE elements are recognized by a class of transcription factors called
Signal Transducers and Activators of Transcription, or "STATs." There are six
members of the STATs family. Stat1 and Stat3 are present in many cell types, as is
Stat2 (as response to IFN-alpha is widespread). Stat4 is more restricted and is not in
many cell types though it has been found in T helper class I cells after treatment with
IL-12. Stat5 was originally called mammary growth factor, but has been found at
higher concentrations in other cells including myeloid cells. It can be activated in tissue
culture cells by many cytokines.

The STATs are activated to translocate from the cytoplasm to the nucleus upon
tyrosine phosphorylation by a set of kinases known as the Janus Kinase ("Jaks")
family. Jaks represent a distinct family of soluble tyrosine kinases and include Tyk2,
Jak1, Jak2, and Jak3. These kinases display significant sequence similarity and are
generally catalytically inactive in resting cells.

The Jaks are activated by a wide range of receptors summarized in the Table
below. (Adapted from review by Schindler and Darnell, Ann. Rev. Biochem. 64:621-51
(1995).) A cytokine receptor family, capable of activating Jaks, is divided into two
groups: (a) Class 1 includes receptors for IL-2, IL-3, IL-4, IL-6, IL-7, IL-9, IL-11,
IL-12, IL-15, Epo, PRL, GH, G-CSF, GM-CSF, LIF, CNTF, and thrombopoietin; and
(b) Class 2 includes IFN-a, IFN-g, and IL-10. The Class 1 receptors share a
conserved cysteine motif (a set of four conserved cysteines and one tryptophan) and a
WSXWS motif (a membrane proximal region encoding Trp-Ser-Xxx-Trp-Ser (SEQ ID
NO:2)).
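For illustration, the WSXWS motif (Trp-Ser-Xxx-Trp-Ser) can be located in a one-letter amino acid sequence with a simple pattern search. The sketch below is a minimal example; the function name and the test sequence are hypothetical and do not correspond to any receptor described here.

```python
import re

# Sketch: find WSXWS motifs (Trp-Ser-any residue-Trp-Ser) in a protein sequence.
# A lookahead is used so that overlapping matches are also reported.
WSXWS = re.compile(r"(?=(WS.WS))")


def find_wsxws(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based position, matched motif) for every WSXWS occurrence."""
    return [(m.start(), m.group(1)) for m in WSXWS.finditer(sequence.upper())]


# Hypothetical fragment, used only to exercise the function.
example = "MKTLLVAGWSEWSPEAQ"
print(find_wsxws(example))
```

A hit only indicates that the sequence pattern is present; whether it lies in a membrane proximal extracellular region would still need to be judged from the full receptor sequence.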

Thus, on binding of a ligand to a receptor, Jaks are activated, which in turn
activate STATs, which then translocate and bind to GAS elements. This entire process
is encompassed in the Jaks-STATs signal transduction pathway.

Therefore, activation of the Jaks-STATs pathway, reflected by the binding of
the GAS or the ISRE element, can be used to indicate proteins involved in the
proliferation and differentiation of cells. For example, growth factors and cytokines are
known to activate the Jaks-STATs pathway. (See Table below.) Thus, by using GAS
elements linked to reporter molecules, activators of the Jaks-STATs pathway can be
identified.

Ligand

IFN family
IFN-a/B
IFN-g
IL-10

gp130 family
IL-6 (Pleiotropic)
IL-11 (Pleiotropic)
OnM (Pleiotropic)
LIF (Pleiotropic)
CNTF (Pleiotropic)
G-CSF (Pleiotropic)
IL-12 (Pleiotropic)

JAKs
tyk2 Jak1 Jak2 Jak3



+ 

? 

? 
? 

-/+ 

7 



g-C family
IL-2 (lymphocytes)
IL-4 (lymph/myeloid) -
IL-7 (lymphocytes)
IL-9 (lymphocytes)
IL-13 (lymphocyte)
IL-15 ?

gp140 family
IL-3 (myeloid)
IL-5 (myeloid)
GM-CSF (myeloid) -

Growth hormone family
GH ?
PRL ?
EPO ?



+ 
+ 

9 



+ 
+ 
+ 
+ 
+ 
+ 



+/- 






Receptor Tyrosine Kinases 

EGF ? + 

PDGF ? + 

CSF-1 ? + 



+ 

9 



+ 

9 

+ 

+ 

9 

+ 



+ 
+ 
+ 



+ 
+ 



+ 
+ 
+ 



+ 
+ 

+ 

9 



STATS 



1,2,3 
1 

1,3 



1,3 
1,3 
1,3 
1,3 
1,3 
1,3 
1,3 



1,3,5 

6 

5 

5 

6 

5 



5 

1,3,5 
5 



1,3 
1,3 
1,3 



GAS (elements) or ISRE

ISRE
GAS (IRF1>Lys6>IFP)

GAS (IRF1>Lys6>IFP)

GAS
GAS (IRF1 = IFP >>Ly6)(IgH)
GAS
GAS
GAS
GAS

GAS (IRF1>IFP>>Ly6)
GAS
GAS

GAS (B-CAS>IRF1=IFP>>Ly6)

GAS (IRF1)
GAS (not IRF1)

To construct a synthetic GAS containing promoter element, which is used in the
Biological Assays described in Examples 13-14, a PCR based strategy is employed to
generate a GAS-SV40 promoter sequence. The 5' primer contains four tandem copies
of the GAS binding site found in the IRF1 promoter and previously demonstrated to
bind STATs upon induction with a range of cytokines (Rothman et al., Immunity
1:457-468 (1994)), although other GAS or ISRE elements can be used instead. The 5'
primer also contains 18 bp of sequence complementary to the SV40 early promoter
sequence and is flanked with an XhoI site. The sequence of the 5' primer is:
5':GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCGAAATGATTTCCCCGAAATATCTGCCATCTCAATTAG:3' (SEQ ID NO:3)

The downstream primer is complementary to the SV40 promoter and is flanked
with a Hind III site: 5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID
NO:4)

PCR amplification is performed using the SV40 promoter template present in
the B-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI/Hind III and subcloned into BLSK2-. (Stratagene.) Sequencing
with forward and reverse primers confirms that the insert contains the following
sequence:

5':CTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCGAAATG
ATTTCCCCGAAATATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCC
CTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGC
CCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGC
CTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTT
TGCAAAAAGCTT:3' (SEQ ID NO:5)
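The design features stated for the two primers (four GAS copies, an XhoI site, an 18 bp SV40-complementary tail, and a Hind III site on the downstream primer) can be checked mechanically. The sketch below is illustrative only: the primer strings are transcribed from the primer sequences given in this Example (SEQ ID NO:3 and NO:4) with the scan errors repaired, which is an assumption, and the GAS core string and helper names are not taken from the original.

```python
# Sketch: sanity checks on the GAS-SV40 primers.
# Sequences are transcribed from this Example with scan errors repaired (assumption).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


FIVE_PRIME_PRIMER = (
    "GCGCCTCGAG"            # clamp + XhoI site
    "ATTTCCCCGAAATCTAG"     # GAS copy 1
    "ATTTCCCCGAAATG"        # GAS copy 2
    "ATTTCCCCGAAATG"        # GAS copy 3
    "ATTTCCCCGAAAT"         # GAS copy 4
    "ATCTGCCATCTCAATTAG"    # 18 bp matching the SV40 early promoter
)
DOWNSTREAM_PRIMER = "GCGGCAAGCTTTTTGCAAAGCCTAGGC"

GAS_CORE = "TTCCCCGAA"      # assumed core, fits the TTCNNNGAA pattern
XHOI, HINDIII = "CTCGAG", "AAGCTT"

gas_copies = FIVE_PRIME_PRIMER.count(GAS_CORE)
sv40_tail = FIVE_PRIME_PRIMER[-18:]
downstream_top_strand = reverse_complement(DOWNSTREAM_PRIMER)

print(gas_copies, sv40_tail)
print(XHOI in FIVE_PRIME_PRIMER, HINDIII in DOWNSTREAM_PRIMER)
print(downstream_top_strand)
```

The reverse complement of the downstream primer gives the expected 3' end of the top strand of the amplified fragment, ending in the Hind III site followed by the primer clamp.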

With this GAS promoter element linked to the SV40 promoter, a GAS:SEAP2
reporter construct is next engineered. Here, the reporter molecule is a secreted alkaline
phosphatase, or "SEAP." Clearly, however, any reporter molecule can be used instead of
SEAP, in this or in any of the other Examples. Well known reporter molecules that can
be used instead of SEAP include chloramphenicol acetyltransferase (CAT), luciferase,
alkaline phosphatase, B-galactosidase, green fluorescent protein (GFP), or any protein
detectable by an antibody.

The above sequence confirmed synthetic GAS-SV40 promoter element is
subcloned into the pSEAP-Promoter vector obtained from Clontech using HindIII and
XhoI, effectively replacing the SV40 promoter with the amplified GAS:SV40 promoter
element, to create the GAS-SEAP vector. However, this vector does not contain a
neomycin resistance gene, and therefore, is not preferred for mammalian expression
systems.

Thus, in order to generate mammalian stable cell lines expressing the GAS-SEAP
reporter, the GAS-SEAP cassette is removed from the GAS-SEAP vector using
SalI and NotI, and inserted into a backbone vector containing the neomycin resistance
gene, such as pGFP-1 (Clontech), using these restriction sites in the multiple cloning
site, to create the GAS-SEAP/Neo vector. Once this vector is transfected into
mammalian cells, this vector can then be used as a reporter molecule for GAS binding
as described in Examples 13-14.

Other constructs can be made using the above description and replacing GAS
with a different promoter sequence. For example, construction of reporter molecules
containing NF-KB and EGR promoter sequences is described in Examples 15 and 16.
However, many other promoters can be substituted using the protocols described in
these Examples. For instance, SRE, IL-2, NFAT, or Osteocalcin promoters can be
substituted, alone or in combination (e.g., GAS/NF-KB/EGR, GAS/NF-KB,
IL-2/NFAT, or NF-KB/GAS). Similarly, other cell lines can be used to test reporter
construct activity, such as HELA (epithelial), HUVEC (endothelial), Reh (B-cell),
Saos-2 (osteoblast), HUVAC (aortic), or Cardiomyocyte.

Example 13: High-Throughput Screening Assay for T-cell Activity. 

The following protocol is used to assess T-cell activity by identifying factors,
such as growth factors and cytokines, that may proliferate or differentiate T-cells.
T-cell activity is assessed using the GAS/SEAP/Neo construct produced in Example 12.
Thus, factors that increase SEAP activity indicate the ability to activate the Jaks-STATS
signal transduction pathway. The T-cell used in this assay is Jurkat T-cells (ATCC
Accession No. TIB-152), although Molt-3 cells (ATCC Accession No. CRL-1552) and
Molt-4 cells (ATCC Accession No. CRL-1582) can also be used.

Jurkat T-cells are lymphoblastic CD4+ Th1 helper cells. In order to generate
stable cell lines, approximately 2 million Jurkat cells are transfected with the
GAS-SEAP/neo vector using DMRIE-C (Life Technologies) (transfection procedure
described below). The transfected cells are seeded to a density of approximately
20,000 cells per well and transfectants resistant to 1 mg/ml genticin are selected. Resistant
colonies are expanded and then tested for their response to increasing concentrations of
interferon gamma. The dose response of a selected clone is demonstrated.

Specifically, the following protocol will yield sufficient cells for 75 wells
containing 200 ul of cells. Thus, it is either scaled up, or performed in multiple to
generate sufficient cells for multiple 96 well plates. Jurkat cells are maintained in RPMI
+ 10% serum with 1% Pen-Strep. Combine 2.5 mls of OPTI-MEM (Life Technologies)
with 10 ug of plasmid DNA in a T25 flask. Add 2.5 ml OPTI-MEM containing 50 ul
of DMRIE-C and incubate at room temperature for 15-45 mins.

During the incubation period, count cell concentration, spin down the required
number of cells (10^ per transfection), and resuspend in OPTI-MEM to a final
concentration of 10^ cells/ml. Then add 1 ml of 1 x 10^ cells in OPTI-MEM to T25 flask
and incubate at 37°C for 6 hrs. After the incubation, add 10 ml of RPMI + 15% serum.

The Jurkat:GAS-SEAP stable reporter lines are maintained in RPMI + 10%
serum, 1 mg/ml Genticin, and 1% Pen-Strep. These cells are treated with supernatants
containing a polypeptide as produced by the protocol described in Example 11.

On the day of treatment with the supernatant, the cells should be washed and
resuspended in fresh RPMI + 10% serum to a density of 500,000 cells per ml. The
exact number of cells required will depend on the number of supernatants being
screened. For one 96 well plate, approximately 10 million cells (for 10 plates, 100
million cells) are required.

Transfer the cells to a triangular reservoir boat, in order to dispense the cells into
a 96 well dish, using a 12 channel pipette. Using a 12 channel pipette, transfer 200 ul
of cells into each well (therefore adding 100,000 cells per well).
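The cell numbers quoted for plating follow directly from the seeding density and the volume per well, as the sketch below shows. The function names are illustrative; the 500,000 cells/ml density, 200 ul per well, and 96-well format come from the protocol.

```python
# Sketch: how many reporter cells are needed to seed n 96-well plates.
# Function names are illustrative; the numbers come from the protocol.

CELLS_PER_ML = 500_000
UL_PER_WELL = 200
WELLS_PER_PLATE = 96


def cells_per_well() -> int:
    """Cells delivered in one 200 ul aliquot."""
    return CELLS_PER_ML * UL_PER_WELL // 1000


def cells_needed(n_plates: int) -> int:
    """Total cells to seed every well of n_plates (no excess included)."""
    return n_plates * WELLS_PER_PLATE * cells_per_well()


def suspension_ml(n_plates: int) -> float:
    """Volume of cell suspension to prepare, in ml (no excess included)."""
    return n_plates * WELLS_PER_PLATE * UL_PER_WELL / 1000.0


print(cells_per_well())      # cells per well
print(cells_needed(1))       # one plate
print(cells_needed(10))      # ten plates
print(suspension_ml(1))      # ml of suspension for one plate
```

The exact totals (9.6 million cells for one plate, 96 million for ten) are slightly below the rounded figures of 10 and 100 million given in the text, which leaves a small working margin.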

After all the plates have been seeded, 50 ul of the supernatants are transferred
directly from the 96 well plate containing the supernatants into each well using a 12
channel pipette. In addition, a dose of exogenous interferon gamma (0.1, 1.0, 10 ng)
is added to wells H9, H10, and H11 to serve as additional positive controls for the
assay.

The 96 well dishes containing Jurkat cells treated with supernatants are placed in
an incubator for 48 hrs (note: this time is variable between 48-72 hrs). 35 ul samples
from each well are then transferred to an opaque 96 well plate using a 12 channel
pipette. The opaque plates should be covered (using cellophane covers) and stored at
-20°C until SEAP assays are performed according to Example 17. The plates
containing the remaining treated cells are placed at 4°C and serve as a source of material
for repeating the assay on a specific well if desired.

As a positive control, 100 Unit/ml interferon gamma, which is known to activate
Jurkat T cells, can be used. Over 30 fold induction is typically observed in the
positive control wells.

Example 14: High-Throughput Screening Assay Identifying Myeloid
Activity

The following protocol is used to assess myeloid activity by identifying factors,
such as growth factors and cytokines, that may proliferate or differentiate myeloid cells.
Myeloid cell activity is assessed using the GAS/SEAP/Neo construct produced in
Example 12. Thus, factors that increase SEAP activity indicate the ability to activate the
Jaks-STATS signal transduction pathway. The myeloid cell used in this assay is U937,
a pre-monocyte cell line, although TF-1, HL60, or KG1 can be used.

To transiently transfect U937 cells with the GAS/SEAP/Neo construct produced
in Example 12, a DEAE-Dextran method (Kharbanda et al., 1994, Cell Growth &
Differentiation, 5:259-265) is used. First, harvest 2x10e7 U937 cells and wash with
PBS. The U937 cells are usually grown in RPMI 1640 medium containing 10%
heat-inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and 100
mg/ml streptomycin.

Next, suspend the cells in 1 ml of 20 mM Tris-HCl (pH 7.4) buffer containing
0.5 mg/ml DEAE-Dextran, 8 ug GAS-SEAP2 plasmid DNA, 140 mM NaCl, 5 mM
KCl, 375 uM Na2HPO4.7H2O, 1 mM MgCl2, and 675 uM CaCl2. Incubate at 37°C
for 45 min.

Wash the cells with RPMI 1640 medium containing 10% FBS and then
resuspend in 10 ml complete medium and incubate at 37°C for 36 hr.

The GAS-SEAP/U937 stable cells are obtained by growing the cells in 400
ug/ml G418. G418-free medium is used for routine growth, but every one to two
months the cells should be re-grown in 400 ug/ml G418 for a couple of passages.

These cells are tested by harvesting 1x10^8 cells (this is enough for ten 96-well
plate assays) and washing with PBS. Suspend the cells in 200 ml of the above described growth
medium, with a final density of 5x10^5 cells/ml. Plate 200 ul cells per well in the
96-well plate (or 1x10^5 cells/well).

Add 50 ul of the supernatant prepared by the protocol described in Example 11.
Incubate at 37°C for 48 to 72 hr. As a positive control, 100 Unit/ml interferon gamma,
which is known to activate U937 cells, can be used. Over 30 fold induction is typically
observed in the positive control wells. SEAP assay the supernatant according to the
protocol described in Example 17.

Example 15: High-Throughput Screening Assay Identifying Neuronal
Activity.

When cells undergo differentiation and proliferation, a group of genes is
activated through many different signal transduction pathways. One of these genes,
EGR1 (early growth response gene 1), is induced in various tissues and cell types upon
activation. The promoter of EGR1 is responsible for such induction. Using the EGR1
promoter linked to reporter molecules, activation of cells can be assessed.

Particularly, the following protocol is used to assess neuronal activity in PC12
cell lines. PC12 cells (rat pheochromocytoma cells) are known to proliferate and/or
differentiate by activation with a number of mitogens, such as TPA (tetradecanoyl
phorbol acetate), NGF (nerve growth factor), and EGF (epidermal growth factor). The
EGR1 gene expression is activated during this treatment. Thus, by stably transfecting
PC12 cells with a construct containing an EGR promoter linked to SEAP reporter,
activation of PC12 cells can be assessed.

The EGR/SEAP reporter construct can be assembled by the following protocol.
The EGR-1 promoter sequence (-633 to +1) (Sakamoto K et al., Oncogene 6:867-871
(1991)) can be PCR amplified from human genomic DNA using the following primers:

5' GCGCTCGAGGGATGACAGCGATAGAACCCCGG -3' (SEQ ID NO:6)

5' GCGAAGCTTCGCGACTCCCCGGATCCGCCTC-3' (SEQ ID NO:7)

Using the GAS:SEAP/Neo vector produced in Example 12, EGR1 amplified
product can then be inserted into this vector. Linearize the GAS:SEAP/Neo vector
using restriction enzymes XhoI/HindIII, removing the GAS/SV40 stuffer. Restrict the
EGR1 amplified product with these same enzymes. Ligate the vector and the EGR1
promoter.

To prepare 96 well-plates for cell culture, two mls of a coating solution (1:30
dilution of collagen type I (Upstate Biotech Inc. Cat#08-115) in 30% ethanol (filter
sterilized)) is added per one 10 cm plate or 50 ml per well of the 96-well plate, and
allowed to air dry for 2 hr.

PC12 cells are routinely grown in RPMI-1640 medium (Bio Whittaker)
containing 10% horse serum (JRH BIOSCIENCES, Cat. # 12449-78P), 5%
heat-inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and 100
ug/ml streptomycin on a precoated 10 cm tissue culture dish. A one to four split is done
every three to four days. Cells are removed from the plates by scraping and
resuspended by pipetting up and down more than 15 times.

Transfect the EGR/SEAP/Neo construct into PC12 using the Lipofectamine
protocol described in Example 11. EGR-SEAP/PC12 stable cells are obtained by
growing the cells in 300 ug/ml G418. G418-free medium is used for routine
growth, but every one to two months the cells should be re-grown in 300 ug/ml G418
for a couple of passages.

To assay for neuronal activity, a 10 cm plate with cells around 70 to 80%
confluent is screened by removing the old medium. Wash the cells once with PBS
(Phosphate buffered saline). Then starve the cells in low serum medium (RPMI-1640
containing 1% horse serum and 0.5% FBS with antibiotics) overnight.

The next morning, remove the medium and wash the cells with PBS. Scrape
the cells off the plate and suspend them well in 2 ml low serum medium. Count
the cell number and add more low serum medium to reach a final cell density of 5x10^5
cells/ml.

Add 200 ul of the cell suspension to each well of a 96-well plate (equivalent to
1x10^5 cells/well). Add 50 ul supernatant produced by Example 11 and incubate at 37°C
for 48 to 72 hr. As a positive control, a growth factor known to activate PC12 cells through EGR
can be used, such as 50 ng/ul of Neuronal Growth Factor (NGF). Over fifty-fold
induction of SEAP is typically seen in the positive control wells. SEAP assay the
supernatant according to Example 17.

Example 16: High-Throughput Screening Assay for T-cell Activity 

NF-kB (Nuclear Factor kB) is a transcription factor activated by a wide variety
of agents including the inflammatory cytokines IL-1 and TNF, CD30 and CD40,
lymphotoxin-alpha and lymphotoxin-beta, by exposure to LPS or thrombin, and by
expression of certain viral gene products. As a transcription factor, NF-kB regulates
the expression of genes involved in immune cell activation, control of apoptosis
(NF-kB appears to shield cells from apoptosis), B and T-cell development, anti-viral and
antimicrobial responses, and multiple stress responses.

In non-stimulated conditions, NF-kB is retained in the cytoplasm with I-kB
(Inhibitor kB). However, upon stimulation, I-kB is phosphorylated and degraded,
causing NF-kB to shuttle to the nucleus, thereby activating transcription of target
genes. Target genes activated by NF-kB include IL-2, IL-6, GM-CSF, ICAM-1 and
class 1 MHC.

Due to its central role and ability to respond to a range of stimuli, reporter
constructs utilizing the NF-kB promoter element are used to screen the supernatants
produced in Example 11. Activators or inhibitors of NF-kB would be useful in treating
diseases. For example, inhibitors of NF-kB could be used to treat those diseases
related to the acute or chronic activation of NF-kB, such as rheumatoid arthritis.

To construct a vector containing the NF-kB promoter element, a PCR based
strategy is employed. The upstream primer contains four tandem copies of the NF-kB
binding site (GGGGACTTTCCC) (SEQ ID NO:8), 18 bp of sequence complementary
to the 5' end of the SV40 early promoter sequence, and is flanked with an XhoI site:

5':GCGGCCTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCCATCCTGCCATCTCAATTAG:3' (SEQ ID NO:9)

The downstream primer is complementary to the 3' end of the SV40 promoter
and is flanked with a Hind III site:

5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID NO:4)

PCR amplification is performed using the SV40 promoter template present in
the pB-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI and Hind III and subcloned into BLSK2-. (Stratagene)
Sequencing with the T7 and T3 primers confirms the insert contains the following
sequence:

5':CTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCC
ATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCA
TCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACT
AATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTC
CAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTT:
3' (SEQ ID NO:10)

Next, replace the SV40 minimal promoter element present in the pSEAP2-promoter
plasmid (Clontech) with this NF-KB/SV40 fragment using XhoI and HindIII.
However, this vector does not contain a neomycin resistance gene, and therefore, is not
preferred for mammalian expression systems.

In order to generate stable mammalian cell lines, the NF-KB/SV40/SEAP
cassette is removed from the above NF-kB/SEAP vector using restriction enzymes SalI
and NotI, and inserted into a vector containing neomycin resistance. Particularly, the
NF-KB/SV40/SEAP cassette was inserted into pGFP-1 (Clontech), replacing the GFP
gene, after restricting pGFP-1 with SalI and NotI.

Once the NF-KB/SV40/SEAP/Neo vector is created, stable Jurkat T-cells are
created and maintained according to the protocol described in Example 13. Similarly,
the method for assaying supernatants with these stable Jurkat T-cells is also described
in Example 13. As a positive control, exogenous TNF alpha (0.1, 1, 10 ng) is added to
wells H9, H10, and H11, with a 5-10 fold activation typically observed.

Example 17: Assay for SEAP Activity 

As a reporter molecule for the assays described in Examples 13-16, SEAP
activity is assayed using the Tropix Phospho-light Kit (Cat. BP-400) according to the
following general procedure. The Tropix Phospho-light Kit supplies the Dilution,
Assay, and Reaction Buffers used below.

Prime a dispenser with the 2.5x Dilution Buffer and dispense 15 µl of 2.5x
dilution buffer into Optiplates containing 35 µl of a supernatant. Seal the plates with a
plastic sealer and incubate at 65°C for 30 min. Separate the Optiplates to avoid uneven
heating.

Cool the samples to room temperature for 15 minutes. Empty the dispenser and
prime with the Assay Buffer. Add 50 µl Assay Buffer and incubate at room
temperature 5 min. Empty the dispenser and prime with the Reaction Buffer (see the
table below). Add 50 µl Reaction Buffer and incubate at room temperature for 20
minutes. Since the intensity of the chemiluminescent signal is time dependent, and it
takes about 10 minutes to read 5 plates on the luminometer, one should treat 5 plates at
a time and start the second set 10 minutes later.

Read the relative light unit in the luminometer. Set H12 as blank, and print the 
results. An increase in chemiluminescence indicates reporter activity. 
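The Reaction Buffer Formulation table for this Example follows a simple linear pattern over the tabulated range of 10 to 22 plates: 5 ml of diluent per plate plus 10 ml, with CSPD at one twentieth of the diluent volume. The sketch below encodes that pattern. The rule is inferred from the tabulated values rather than stated by the kit supplier, so applying it outside the tabulated range is an assumption.

```python
# Sketch: reaction buffer volumes as a function of plate count.
# The linear rule is inferred from the table (10 to 22 plates); it is not a
# supplier specification, and extrapolation beyond that range is an assumption.


def reaction_buffer(n_plates: int) -> tuple[float, float]:
    """Return (Rxn buffer diluent in ml, CSPD in ml) for n_plates."""
    diluent_ml = 5 * n_plates + 10
    cspd_ml = diluent_ml / 20
    return diluent_ml, cspd_ml


for plates in (10, 13, 22):
    print(plates, reaction_buffer(plates))
```

The fixed 10 ml offset plausibly covers priming and dead volume in the dispenser, although the text does not say so.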

Reaction Buffer Formulation:

# of plates    Rxn buffer diluent (ml)    CSPD (ml)
10             60                         3
11             65                         3.25
12             70                         3.5
13             75                         3.75
14             80                         4
15             85                         4.25
16             90                         4.5
17             95                         4.75
18             100                        5
19             105                        5.25
20             110                        5.5
21             115                        5.75
22             120                        6

Example 18: High-Throughput Screening Assay Identifying Changes in
Small Molecule Concentration and Membrane Permeability

Binding of a ligand to a receptor is known to alter intracellular levels of small
molecules, such as calcium, potassium, sodium, and pH, as well as alter membrane
potential. These alterations can be measured in an assay to identify supernatants which
bind to receptors of a particular cell. Although the following protocol describes an
assay for calcium, this protocol can easily be modified to detect changes in potassium,
sodium, pH, membrane potential, or any other small molecule which is detectable by a
fluorescent probe.

The following assay uses Fluorometric Imaging Plate Reader ("FLIPR") to 
measure changes in fluorescent molecules (Molecular Probes) that bind small 
molecules. Clearly, any fluorescent molecule detecting a small molecule can be used 
instead of the calcium fluorescent molecule, fluo-3, used here. 

For adherent cells, seed the cells at 10,000-20,000 cells/well in a Co-star black 
96-well plate with clear bottom. The plate is incubated in a CO2 incubator for 20 hours. 
The adherent cells are washed two times in a Biotek washer with 200 ul of HBSS 
(Hank's Balanced Salt Solution), leaving 100 ul of buffer after the final wash. 

A stock solution of 1 mg/ml fluo-3 is made in 10% pluronic acid DMSO. To 
load the cells with fluo-3, 50 ul of 12 ug/ml fluo-3 is added to each well. The plate is 
incubated at 37°C in a CO2 incubator for 60 min. The plate is washed four times in the 
Biotek washer with HBSS, leaving 100 ul of buffer. 

For non-adherent cells, the cells are spun down from culture media. Cells are 
re-suspended to 2-5x10^6 cells/ml with HBSS in a 50-ml conical tube. 4 ul of 1 mg/ml 
fluo-3 solution in 10% pluronic acid DMSO is added to each ml of cell suspension. 
The tube is then placed in a 37°C water bath for 30-60 min. The cells are washed twice 
with HBSS, resuspended to 1x10^6 cells/ml, and dispensed into a microplate, 100 
ul/well. The plate is centrifuged at 1000 rpm for 5 min. The plate is then washed once 
in a Denley CellWash with 200 ul, followed by an aspiration step to 100 ul final volume. 
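
The two loading procedures can be compared with a short dilution calculation. The sketch below assumes the adherent-cell loading solution is 12 ug/ml fluo-3 added to the 100 ul of buffer left after washing, and that the suspension protocol adds 4 ul of the 1 mg/ml stock to 1 ml of cells; the function name is illustrative only.

```python
def final_conc_ug_per_ml(stock_ug_per_ml: float, added_ul: float, existing_ul: float) -> float:
    """Concentration after adding `added_ul` of stock to `existing_ul` of buffer."""
    return stock_ug_per_ml * added_ul / (added_ul + existing_ul)


# Adherent cells: 50 ul of 12 ug/ml fluo-3 into 100 ul of HBSS.
adherent = final_conc_ug_per_ml(12.0, 50.0, 100.0)

# Non-adherent cells: 4 ul of 1 mg/ml (1000 ug/ml) stock into 1 ml of suspension.
suspension = final_conc_ug_per_ml(1000.0, 4.0, 1000.0)

if __name__ == "__main__":
    print(f"adherent wells: {adherent:.2f} ug/ml")
    print(f"cell suspension: {suspension:.2f} ug/ml")
```

Under these assumptions both routes load the cells at roughly 4 ug/ml fluo-3.
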

For a non-cell based assay, each well contains a fluorescent molecule, such as 
fluo-3. The supernatant is added to the well, and a change in fluorescence is detected. 

To measure the fluorescence of intracellular calcium, the FLIPR is set for the 
following parameters: (1) System gain is 300-800 mW; (2) Exposure time is 0.4 
second; (3) Camera F/stop is F/2; (4) Excitation is 488 nm; (5) Emission is 530 nm; and 
(6) Sample addition is 50 ul. Increased emission at 530 nm indicates an extracellular 
signaling event which has resulted in an increase in the intracellular Ca++ 
concentration. 

Example 19: High-Throughput Screening Assay Identifying Tyrosine 
Kinase Activity 

The Protein Tyrosine Kinases (PTK) represent a diverse group of 
transmembrane and cytoplasmic kinases. Within the Receptor Protein Tyrosine Kinase 
(RPTK) group are receptors for a range of mitogenic and metabolic growth factors 
including the PDGF, FGF, EGF, NGF, HGF and Insulin receptor subfamilies. In 
addition there is a large family of RPTKs for which the corresponding ligand is 
unknown. Ligands for RPTKs include mainly secreted small proteins, but also 
membrane-bound and extracellular matrix proteins. 

Activation of RPTK by ligands involves ligand-mediated receptor dimerization, 
resulting in transphosphorylation of the receptor subunits and activation of the 
cytoplasmic tyrosine kinases. The cytoplasmic tyrosine kinases include receptor 
associated tyrosine kinases of the src-family (e.g., src, yes, lck, lyn, fyn) and 
non-receptor linked and cytosolic protein tyrosine kinases, such as the Jak family, members 
of which mediate signal transduction triggered by the cytokine superfamily of receptors 
(e.g., the Interleukins, Interferons, GM-CSF, and Leptin). 

Because of the wide range of known factors capable of stimulating tyrosine 
kinase activity, the identification of novel human secreted proteins capable of activating 
tyrosine kinase signal transduction pathways is of interest. Therefore, the following 
protocol is designed to identify those novel human secreted proteins capable of 
activating the tyrosine kinase signal transduction pathways. 

Seed target cells (e.g., primary keratinocytes) at a density of approximately 
25,000 cells per well in 96-well Loprodyne Silent Screen Plates purchased from 
Nalge Nunc (Naperville, IL). The plates are sterilized with two 30 minute rinses with 
100% ethanol, rinsed with water and dried overnight. Some plates are coated for 2 hr 
with 100 ml of cell culture grade type I collagen (50 mg/ml), gelatin (2%) or polylysine 
(50 mg/ml), all of which can be purchased from Sigma Chemicals (St. Louis, MO), or 
10% Matrigel purchased from Becton Dickinson (Bedford, MA), or calf serum, rinsed 
with PBS and stored at 4°C. Cell growth on these plates is assayed by seeding 5,000 
cells/well in growth medium and indirectly quantitating cell number after 48 hr through 
use of alamarBlue as described by the manufacturer, Alamar Biosciences, Inc. (Sacramento, 
CA). Falcon plate covers #3071 from Becton Dickinson (Bedford, MA) are 
used to cover the Loprodyne Silent Screen Plates. Falcon Microtest III cell culture 
plates can also be used in some proliferation experiments. 

To prepare extracts, A431 cells are seeded onto the nylon membranes of 
Loprodyne plates (20,000/200ml/well) and cultured overnight in complete medium. 
Cells are quiesced by incubation in serum-free basal medium for 24 hr. After 5-20 
minutes treatment with EGF (60 ng/ml) or 50 ul of the supernatant produced in Example 
11, the medium is removed and 100 ml of extraction buffer (20 mM HEPES pH 
7.5, 0.15 M NaCl, 1% Triton X-100, 0.1% SDS, 2 mM Na3VO4, 2 mM Na4P2O7 
and a cocktail of protease inhibitors (# 1836170) obtained from Boehringer Mannheim 
(Indianapolis, IN)) is added to each well and the plate is shaken on a rotating shaker for 
5 minutes at 4°C. The plate is then placed in a vacuum transfer manifold and the extract 
filtered through the 0.45 um membrane bottoms of each well using house vacuum. 
Extracts are collected in a 96-well catch/assay plate in the bottom of the vacuum 
manifold and immediately placed on ice. To obtain extracts clarified by centrifugation, 
the content of each well, after detergent solubilization for 5 minutes, is removed and 
centrifuged for 15 minutes at 4°C at 16,000 x g. 

Test the filtered extracts for levels of tyrosine kinase activity. Although many 
methods of detecting tyrosine kinase activity are known, one method is described here. 

Generally, the tyrosine kinase activity of a supernatant is evaluated by 
determining its ability to phosphorylate a tyrosine residue on a specific substrate (a 
biotinylated peptide). Biotinylated peptides that can be used for this purpose include 
PSK1 (corresponding to amino acids 6-20 of the cell division kinase cdc2-p34) and 
PSK2 (corresponding to amino acids 1-17 of gastrin). Both peptides are substrates for 
a range of tyrosine kinases and are available from Boehringer Mannheim. 

The tyrosine kinase reaction is set up by adding the following components in 
order. First, add 10 ul of 5 uM Biotinylated Peptide, then 10 ul ATP/Mg2+ (5 mM 
ATP/50 mM MgCl2), then 10 ul of 5x Assay Buffer (40 mM imidazole hydrochloride, 
pH 7.3, 40 mM beta-glycerophosphate, 1 mM EGTA, 100 mM MgCl2, 5 mM MnCl2, 
0.5 mg/ml BSA), then 5 ul of Sodium Vanadate (1 mM), and then 5 ul of water. Mix the 
components gently and preincubate the reaction mix at 30°C for 2 min. Initiate the 
reaction by adding 10 ul of the control enzyme or the filtered supernatant. 

The tyrosine kinase assay reaction is then terminated by adding 10 ul of 120 mM 
EDTA and placing the reactions on ice. 
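
As a check on the reaction set-up, the component volumes can be summed and the final concentrations derived from the stated stocks. This is a minimal sketch under the assumption that the volumes read 10, 10, 10, 5, 5 and 10 ul as listed in the protocol; the dictionary keys and function names are illustrative.

```python
# Volumes (ul) added in order, ending with the enzyme or filtered supernatant.
VOLUMES_UL = {
    "biotinylated peptide": 10.0,
    "ATP/Mg2+": 10.0,
    "5x assay buffer": 10.0,
    "sodium vanadate": 5.0,
    "water": 5.0,
    "enzyme or supernatant": 10.0,
}

# Stock concentrations for the components whose stocks are stated.
STOCKS = {
    "biotinylated peptide": (5.0, "uM"),
    "ATP/Mg2+": (5.0, "mM ATP"),
    "sodium vanadate": (1.0, "mM"),
}


def reaction_volume_ul() -> float:
    """Total volume of the running reaction, before EDTA is added."""
    return sum(VOLUMES_UL.values())


def final_concentration(component: str) -> tuple[float, str]:
    """Final concentration of a component in the running reaction."""
    stock, unit = STOCKS[component]
    return stock * VOLUMES_UL[component] / reaction_volume_ul(), unit


if __name__ == "__main__":
    print("reaction volume (ul):", reaction_volume_ul())
    for name in STOCKS:
        print(name, final_concentration(name))
```

With these volumes the reaction totals 50 ul, so the 10 ul of 5x Assay Buffer ends up at 1x, and the 50 ul aliquot transferred in the next step corresponds to the reaction volume before EDTA is added.
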

Tyrosine kinase activity is determined by transferring a 50 ul aliquot of the reaction 
mixture to a microtiter plate (MTP) module and incubating at 37°C for 20 min. This 
allows the streptavidin-coated 96-well plate to associate with the biotinylated peptide. 
Wash the MTP module with 300 ul/well of PBS four times. Next add 75 ul of 
anti-phosphotyrosine antibody conjugated to horseradish peroxidase (anti-P-Tyr-POD 
(0.5 u/ml)) to each well and incubate at 37°C for one hour. Wash the well as 
above. 

Next add 100 ul of peroxidase substrate solution (Boehringer Mannheim) and 
incubate at room temperature for at least 5 min (up to 30 min). Measure the 
absorbance of the sample at 405 nm by using an ELISA reader. The level of bound 
peroxidase activity is quantitated using an ELISA reader and reflects the level of 
tyrosine kinase activity. 

Example 20: High-Throughput Screening Assay Identifying 
Phosphorylation Activity 

As a potential alternative and/or complement to the assay of protein tyrosine 
kinase activity described in Example 19, an assay which detects activation 
(phosphorylation) of major intracellular signal transduction intermediates can also be 
used. For example, as described below, one particular assay can detect tyrosine 
phosphorylation of the Erk-1 and Erk-2 kinases. However, phosphorylation of other 
molecules, such as Raf, JNK, p38 MAP, Map kinase kinase (MEK), MEK kinase, 
Src, Muscle specific kinase (MuSK), IRAK, Tec, and Janus, as well as any other 
phosphoserine, phosphotyrosine, or phosphothreonine molecule, can be detected by 
substituting these molecules for Erk-1 or Erk-2 in the following assay. 

Specifically, assay plates are made by coating the wells of a 96-well ELISA 
plate with 0.1 ml of protein G (1 ug/ml) for 2 hr at room temperature (RT). The plates are 
then rinsed with PBS and blocked with 3% BSA/PBS for 1 hr at RT. The protein G plates 
are then treated with 2 commercial monoclonal antibodies (100 ng/well) against Erk-1 
and Erk-2 (1 hr at RT) (Santa Cruz Biotechnology). (To detect other molecules, this 
step can easily be modified by substituting a monoclonal antibody detecting any of the 
above described molecules.) After 3-5 rinses with PBS, the plates are stored at 4°C 
until use. 

A431 cells are seeded at 20,000/well in a 96-well Loprodyne filterplate and 
cultured overnight in growth medium. The cells are then starved for 48 hr in basal 
medium (DMEM) and then treated with EGF (6 ng/well) or 50 ul of the supernatants 
obtained in Example 11 for 5-20 minutes. The cells are then solubilized and extracts 
filtered directly into the assay plate. 

After incubation with the extract for 1 hr at RT, the wells are again rinsed. As a 
positive control, a commercial preparation of MAP kinase (10 ng/well) is used in place 
of A431 extract. Plates are then treated with a commercial polyclonal (rabbit) antibody 
(1 ug/ml) which specifically recognizes the phosphorylated epitope of the Erk-1 and 
Erk-2 kinases (1 hr at RT). This antibody is biotinylated by standard procedures. The 
bound polyclonal antibody is then quantitated by successive incubations with 
Europium-streptavidin and Europium fluorescence enhancing reagent in the Wallac 
DELFIA instrument (time-resolved fluorescence). An increased fluorescent signal over 
background indicates phosphorylation. 
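
One way to express "increased signal over background" numerically is a fold-over-background ratio. The sketch below is only an illustration: the text does not specify a cutoff, so the `min_fold` default of 2.0 and the function names are assumptions, and the readings in the usage lines are hypothetical.

```python
from statistics import mean


def fold_over_background(sample_counts, background_counts):
    """Ratio of mean sample fluorescence to mean background fluorescence."""
    background = mean(background_counts)
    if background <= 0:
        raise ValueError("background must be positive")
    return mean(sample_counts) / background


def is_increased(sample_counts, background_counts, min_fold=2.0):
    """True when the signal is at least `min_fold` times background.

    The cutoff is an assumed value, not one given in the protocol.
    """
    return fold_over_background(sample_counts, background_counts) >= min_fold


if __name__ == "__main__":
    # Hypothetical replicate readings for a treated well and untreated wells.
    print(fold_over_background([300, 500], [100, 100]))
    print(is_increased([150], [100]))
```
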

Example 21: Method of Determining Alterations in a Gene 
Corresponding to a Polynucleotide 

RNA is isolated from entire families or individual patients presenting with a 
phenotype of interest (such as a disease). cDNA is then generated from 
these RNA samples using protocols known in the art. (See, Sambrook.) The cDNA is 
then used as a template for PCR, employing primers surrounding regions of interest in 
SEQ ID NO:X. Suggested PCR conditions consist of 35 cycles at 95°C for 30 
seconds; 60-120 seconds at 52-58°C; and 60-120 seconds at 70°C, using buffer 
solutions described in Sidransky, D., et al., Science 252:706 (1991). 
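
The suggested cycling conditions can be written down as data, which also gives a rough bound on run time. In this sketch the step labels (denature, anneal, extend) are an interpretation of the three temperatures rather than terms from the text, and ramp times between steps are ignored.

```python
CYCLES = 35

# (label, temperature range in deg C, duration range in seconds)
STEPS = [
    ("denature", (95, 95), (30, 30)),
    ("anneal", (52, 58), (60, 120)),
    ("extend", (70, 70), (60, 120)),
]


def cycling_time_minutes() -> tuple[float, float]:
    """Shortest and longest total hold time across all cycles, in minutes."""
    shortest = CYCLES * sum(duration[0] for _, _, duration in STEPS)
    longest = CYCLES * sum(duration[1] for _, _, duration in STEPS)
    return shortest / 60, longest / 60


if __name__ == "__main__":
    low, high = cycling_time_minutes()
    print(f"total hold time: {low} to {high} minutes")
```
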

PCR products are then sequenced using primers labeled at their 5' end with T4 
polynucleotide kinase, employing SequiTherm Polymerase (Epicentre Technologies). 
The intron-exon borders of selected exons are also determined and genomic PCR 
products analyzed to confirm the results. PCR products harboring suspected mutations 
are then cloned and sequenced to validate the results of the direct sequencing. 

PCR products are cloned into T-tailed vectors as described in Holton, T.A. and 
Graham, M.W., Nucleic Acids Research, 19:1156 (1991) and sequenced with T7 
polymerase (United States Biochemical). Affected individuals are identified by 
mutations not present in unaffected individuals. 

Genomic rearrangements are also observed as a method of determining 
alterations in a gene corresponding to a polynucleotide. Genomic clones isolated 
according to Example 2 are nick-translated with digoxigenin-deoxyuridine 
5'-triphosphate (Boehringer Mannheim), and FISH performed as described in Johnson, 
Cg. et al., Methods Cell Biol. 35:73-99 (1991). Hybridization with the labeled probe is 
carried out using a vast excess of human cot-1 DNA for specific hybridization to the 
corresponding genomic locus. 

Chromosomes are counterstained with 4',6-diamidino-2-phenylindole and 
propidium iodide, producing a combination of C- and R-bands. Aligned images for 
precise mapping are obtained using a triple-band filter set (Chroma Technology, 
Brattleboro, VT) in combination with a cooled charge-coupled device camera 
(Photometrics, Tucson, AZ) and variable excitation wavelength filters. (Johnson, Cv. 
et al., Genet. Anal. Tech. Appl., 8:75 (1991).) Image collection, analysis and 
chromosomal fractional length measurements are performed using the ISee Graphical 
Program System. (Inovision Corporation, Durham, NC.) Chromosome alterations of 
the genomic region hybridized by the probe are identified as insertions, deletions, and 
translocations. These alterations are used as a diagnostic marker for an associated 
disease. 

Example 22: Method of Detecting Abnormal Levels of a Polypeptide in a 
Biological Sample 

A polypeptide of the present invention can be detected in a biological sample, 
and if an increased or decreased level of the polypeptide is detected, this polypeptide is 
a marker for a particular phenotype. Methods of detection are numerous, and thus, it is 
understood that one skilled in the art can modify the following assay to fit their 
particular needs. 

For example, antibody-sandwich ELISAs are used to detect polypeptides in a 
sample, preferably a biological sample. Wells of a microtiter plate are coated with 
specific antibodies, at a final concentration of 0.2 to 10 ug/ml. The antibodies are either 
monoclonal or polyclonal and are produced by the method described in Example 10. 
The wells are blocked so that non-specific binding of the polypeptide to the well is 
reduced. 

The coated wells are then incubated for > 2 hours at RT with a sample 
containing the polypeptide. Preferably, serial dilutions of the sample should be used to 
validate results. The plates are then washed three times with deionized or distilled water 
to remove unbound polypeptide. 

Next, 50 ul of specific antibody-alkaline phosphatase conjugate, at a 
concentration of 25-400 ng, is added and incubated for 2 hours at room temperature. 
The plates are again washed three times with deionized or distilled water to remove 
unbound conjugate. 

Add 75 ul of 4-methylumbelliferyl phosphate (MUP) or p-nitrophenyl 
phosphate (NPP) substrate solution to each well and incubate 1 hour at room 
temperature. Measure the reaction by a microtiter plate reader. Prepare a standard 
curve, using serial dilutions of a control sample, and plot polypeptide concentration on 
the X-axis (log scale) and fluorescence or absorbance on the Y-axis (linear scale). 
Interpolate the concentration of the polypeptide in the sample using the standard curve. 
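
The interpolation step can be sketched as piecewise-linear interpolation on a log-concentration axis, matching the log X-axis and linear Y-axis described above. This is one simple approach chosen for illustration (a fitted curve model could be used instead); the function name and the standard values in the usage lines are hypothetical.

```python
import math


def interpolate_concentration(signal, standards):
    """Estimate concentration from a standard curve.

    `standards` is a list of (concentration, signal) pairs with positive
    concentrations and signal strictly increasing with concentration.
    Interpolation is linear in signal against log10(concentration).
    """
    points = sorted((math.log10(conc), sig) for conc, sig in standards)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= signal <= y1:
            fraction = (signal - y0) / (y1 - y0)
            return 10 ** (x0 + fraction * (x1 - x0))
    raise ValueError("signal lies outside the standard curve")


if __name__ == "__main__":
    # Hypothetical standards: (concentration, plate-reader signal).
    curve = [(1.0, 0.1), (10.0, 0.5), (100.0, 0.9)]
    print(interpolate_concentration(0.7, curve))
```

Signals outside the range of the standards raise an error rather than being extrapolated, which is why serial dilutions of the sample are useful.
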

Example 23: Formulating a Polypeptide 

The secreted polypeptide composition will be formulated and dosed in a fashion 
consistent with good medical practice, taking into account the clinical condition of the 
individual patient (especially the side effects of treatment with the secreted polypeptide 
alone), the site of delivery, the method of administration, the scheduling of 
administration, and other factors known to practitioners. The "effective amount" for 
purposes herein is thus determined by such considerations. 

As a general proposition, the total pharmaceutically effective amount of secreted 
polypeptide administered parenterally per dose will be in the range of about 1 ug/kg/day 
to 10 mg/kg/day of patient body weight, although, as noted above, this will be subject 
to therapeutic discretion. More preferably, this dose is at least 0.01 mg/kg/day, and 
most preferably for humans between about 0.01 and 1 mg/kg/day for the hormone. If 
given continuously, the secreted polypeptide is typically administered at a dose rate of 
about 1 ug/kg/hour to about 50 ug/kg/hour, either by 1-4 injections per day or by 
continuous subcutaneous infusions, for example, using a mini-pump. An intravenous 
bag solution may also be employed. The length of treatment needed to observe changes 
and the interval following treatment for responses to occur appear to vary depending 
on the desired effect. 
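
The per-kilogram ranges translate into absolute daily amounts once a body weight is fixed. The sketch below assumes the lower bound reads 1 ug/kg/day (0.001 mg/kg/day) and uses a hypothetical 70 kg patient; it is arithmetic only and not dosing guidance.

```python
def dose_range_mg_per_day(body_weight_kg, low_mg_per_kg=0.001, high_mg_per_kg=10.0):
    """Daily dose window in mg, from per-kg limits given in mg/kg/day."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg


if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical patient weight
    print("general range:", dose_range_mg_per_day(weight_kg))
    print("most preferred range:", dose_range_mg_per_day(weight_kg, 0.01, 1.0))
```
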

Pharmaceutical compositions containing the secreted protein of the invention are 
administered orally, rectally, parenterally, intracisternally, intravaginally, 
intraperitoneally, topically (as by powders, ointments, gels, drops or transdermal 
patch), bucally, or as an oral or nasal spray. "Pharmaceutically acceptable carrier" refers 
to a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or 
formulation auxiliary of any type. The term "parenteral" as used herein refers to modes 
of administration which include intravenous, intramuscular, intraperitoneal, intrasternal, 
subcutaneous and intraarticular injection and infusion. 

The secreted polypeptide is also suitably administered by sustained-release 
systems. Suitable examples of sustained-release compositions include semi-permeable 
polymer matrices in the form of shaped articles, e.g., films, or microcapsules. 
Sustained-release matrices include polylactides (U.S. Pat. No. 3,773,919, EP 58,481), 
copolymers of L-glutamic acid and gamma-ethyl-L-glutamate (Sidman, U. et al., 
Biopolymers 22:547-556 (1983)), poly (2-hydroxyethyl methacrylate) (R. Langer et 
al., J. Biomed. Mater. Res. 15:167-277 (1981), and R. Langer, Chem. Tech. 12:98-105 
(1982)), ethylene vinyl acetate (R. Langer et al.) or poly-D-(-)-3-hydroxybutyric 
acid (EP 133,988). Sustained-release compositions also include liposomally entrapped 
polypeptides. Liposomes containing the secreted polypeptide are prepared by methods 
known per se: DE 3,218,121; Epstein et al., Proc. Natl. Acad. Sci. USA 82:3688-3692 
(1985); Hwang et al., Proc. Natl. Acad. Sci. USA 77:4030-4034 (1980); EP 52,322; 
EP 36,676; EP 88,046; EP 143,949; EP 142,641; Japanese Pat. Appl. 83-118008; 
U.S. Pat. Nos. 4,485,045 and 4,544,545; and EP 102,324. Ordinarily, the liposomes 
are of the small (about 200-800 Angstroms) unilamellar type in which the lipid content 
is greater than about 30 mol. percent cholesterol, the selected proportion being adjusted 
for the optimal secreted polypeptide therapy. 

For parenteral administration, in one embodiment, the secreted polypeptide is 
formulated generally by mixing it at the desired degree of purity, in a unit dosage 
injectable form (solution, suspension, or emulsion), with a pharmaceutically acceptable 
carrier, i.e., one that is non-toxic to recipients at the dosages and concentrations 
employed and is compatible with other ingredients of the formulation. For example, the 
formulation preferably does not include oxidizing agents and other compounds that are 
known to be deleterious to polypeptides. 

Generally, the formulations are prepared by contacting the polypeptide 
uniformly and intimately with liquid carriers or finely divided solid carriers or both. 
Then, if necessary, the product is shaped into the desired formulation. Preferably the 
carrier is a parenteral carrier, more preferably a solution that is isotonic with the blood 
of the recipient. Examples of such carrier vehicles include water, saline, Ringer's 
solution, and dextrose solution. Non-aqueous vehicles such as fixed oils and ethyl 
oleate are also useful herein, as well as liposomes. 

The carrier suitably contains minor amounts of additives such as substances that 
enhance isotonicity and chemical stability. Such materials are non-toxic to recipients at 
the dosages and concentrations employed, and include buffers such as phosphate, 
citrate, succinate, acetic acid, and other organic acids or their salts; antioxidants such as 
ascorbic acid; low molecular weight (less than about ten residues) polypeptides, e.g., 
polyarginine or tripeptides; proteins, such as serum albumin, gelatin, or 
immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids, 
such as glycine, glutamic acid, aspartic acid, or arginine; monosaccharides, 
disaccharides, and other carbohydrates including cellulose or its derivatives, glucose, 
mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or 
sorbitol; counterions such as sodium; and/or nonionic surfactants such as polysorbates, 
poloxamers, or PEG. 

The secreted polypeptide is typically formulated in such vehicles at a 
concentration of about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH of 
about 3 to 8. It will be understood that the use of certain of the foregoing excipients, 
carriers, or stabilizers will result in the formation of polypeptide salts. 

Any polypeptide to be used for therapeutic administration can be sterile. 
Sterility is readily accomplished by filtration through sterile filtration membranes (e.g., 
0.2 micron membranes). Therapeutic polypeptide compositions generally are placed 
into a container having a sterile access port, for example, an intravenous solution bag or 
vial having a stopper pierceable by a hypodermic injection needle. 

Polypeptides ordinarily will be stored in unit or multi-dose containers, for 
example, sealed ampoules or vials, as an aqueous solution or as a lyophilized 
formulation for reconstitution. As an example of a lyophilized formulation, 10-ml vials 
are filled with 5 ml of sterile-filtered 1% (w/v) aqueous polypeptide solution, and the 
resulting mixture is lyophilized. The infusion solution is prepared by reconstituting the 
lyophilized polypeptide using bacteriostatic Water-for-Injection. 
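
The lyophilized example implies a fixed mass of polypeptide per vial, which in turn sets the reconstitution volume for any target concentration. The sketch below uses the fill volume and strength given in the text; the target concentration in the usage lines is a hypothetical value chosen from within the stated 1-10 mg/ml preferred range.

```python
def polypeptide_mg_per_vial(fill_ml=5.0, percent_w_v=1.0):
    """Mass of polypeptide in one vial.

    1% (w/v) means 1 g per 100 ml, which is 10 mg per ml.
    """
    return fill_ml * percent_w_v * 10.0


def reconstitution_volume_ml(mass_mg, target_mg_per_ml):
    """Volume of diluent needed to reach a target concentration."""
    if target_mg_per_ml <= 0:
        raise ValueError("target concentration must be positive")
    return mass_mg / target_mg_per_ml


if __name__ == "__main__":
    mass = polypeptide_mg_per_vial()
    print("mg per vial:", mass)
    print("ml of diluent for 10 mg/ml:", reconstitution_volume_ml(mass, 10.0))
```
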

The invention also provides a pharmaceutical pack or kit comprising one or 
more containers filled with one or more of the ingredients of the pharmaceutical 
compositions of the invention. Associated with such container(s) can be a notice in the 
form prescribed by a governmental agency regulating the manufacture, use or sale of 
pharmaceuticals or biological products, which notice reflects approval by the agency of 
manufacture, use or sale for human administration. In addition, the polypeptides of the 
present invention may be employed in conjunction with other therapeutic compounds. 

Example 24: Method of Treating Decreased Levels of the Polypeptide 

It will be appreciated that conditions caused by a decrease in the standard or 
normal expression level of a secreted protein in an individual can be treated by 
administering the polypeptide of the present invention, preferably in the secreted form. 
Thus, the invention also provides a method of treatment of an individual in need of an 
increased level of the polypeptide comprising administering to such an individual a 
pharmaceutical composition comprising an amount of the polypeptide to increase the 
activity level of the polypeptide in such an individual. 

For example, a patient with decreased levels of a polypeptide receives a daily 
dose of 0.1-100 ug/kg of the polypeptide for six consecutive days. Preferably, the 
polypeptide is in the secreted form. The exact details of the dosing scheme, based on 
administration and formulation, are provided in Example 23. 

Example 25: Method of Treating Increased Levels of the Polypeptide 

Antisense technology is used to inhibit production of a polypeptide of the 
present invention. This technology is one example of a method of decreasing levels of 
a polypeptide, preferably a secreted form, due to a variety of etiologies, such as cancer. 

For example, a patient diagnosed with abnormally increased levels of a 
polypeptide is administered intravenously antisense polynucleotides at 0.5, 1.0, 1.5, 
2.0 and 3.0 mg/kg/day for 21 days. This treatment is repeated after a 7-day rest period 
if the treatment was well tolerated. The formulation of the antisense polynucleotide is 
provided in Example 23. 

Example 26: Method of Treatment Using Gene Therapy 

One method of gene therapy transplants fibroblasts, which are capable of 
expressing a polypeptide, onto a patient. Generally, fibroblasts are obtained from a 
subject by skin biopsy. The resulting tissue is placed in tissue-culture medium and 
separated into small pieces. Small chunks of the tissue are placed on a wet surface of a 
tissue culture flask; approximately ten pieces are placed in each flask. The flask is 
turned upside down, closed tight and left at room temperature overnight. After 24 
hours at room temperature, the flask is inverted so that the chunks of tissue remain fixed to 
the bottom of the flask, and fresh media (e.g., Ham's F12 media, with 10% FBS, 
penicillin and streptomycin) is added. The flasks are then incubated at 37°C for 
approximately one week. 
At this time, fresh media is added and subsequently changed every several days. 
After an additional two weeks in culture, a monolayer of fibroblasts emerges. The 
monolayer is trypsinized and scaled into larger flasks. 

pMV-7 (Kirschmeier, P.T. et al., DNA, 7:219-25 (1988)), flanked by the long 
terminal repeats of the Moloney murine sarcoma virus, is digested with EcoRI and 
HindIII and subsequently treated with calf intestinal phosphatase. The linear vector is 
fractionated on agarose gel and purified, using glass beads. 

The cDNA encoding a polypeptide of the present invention can be amplified 
using PCR primers which correspond to the 5' and 3' end sequences respectively as set 
forth in Example 1. Preferably, the 5' primer contains an EcoRI site and the 3' primer 
includes a HindIII site. Equal quantities of the Moloney murine sarcoma virus linear 
backbone and the amplified EcoRI and HindIII fragment are added together, in the 
presence of T4 DNA ligase. The resulting mixture is maintained under conditions 
appropriate for ligation of the two fragments. The ligation mixture is then used to 
transform bacteria HB101, which are then plated onto agar containing kanamycin for 
the purpose of confirming that the vector has the gene of interest properly inserted. 

The amphotropic pA317 or GP+am12 packaging cells are grown in tissue 
culture to confluent density in Dulbecco's Modified Eagles Medium (DMEM) with 10% 
calf serum (CS), penicillin and streptomycin. The MSV vector containing the gene is 
then added to the media and the packaging cells transduced with the vector. The 
packaging cells now produce infectious viral particles containing the gene (the 
packaging cells are now referred to as producer cells). 

Fresh media is added to the transduced producer cells, and subsequently, the 
media is harvested from a 10 cm plate of confluent producer cells. The spent media, 
containing the infectious viral particles, is filtered through a millipore filter to remove 
detached producer cells and this media is then used to infect fibroblast cells. Media is 
removed from a sub-confluent plate of fibroblasts and quickly replaced with the media 
from the producer cells. This media is removed and replaced with fresh media. If the 
titer of virus is high, then virtually all fibroblasts will be infected and no selection is 
required. If the titer is very low, then it is necessary to use a retroviral vector that has a 
selectable marker, such as neo or his. Once the fibroblasts have been efficiently 
infected, the fibroblasts are analyzed to determine whether protein is produced. 

The engineered fibroblasts are then transplanted onto the host, either alone or 
after having been grown to confluence on cytodex 3 microcarrier beads. 

Example 27: Method of Treatment Using Gene Therapy - In Vivo 

Another aspect of the present invention is using in vivo gene therapy 
methods to treat disorders, diseases and conditions. The gene therapy method 
relates to the introduction of naked nucleic acid (DNA, RNA, and antisense 
DNA or RNA) sequences into an animal to increase or decrease the expression 
of the polypeptide of the present invention. A polynucleotide of the present 
invention may be operatively linked to a promoter or any other genetic elements 
necessary for the expression of the encoded polypeptide by the target tissue. 
Such gene therapy and delivery techniques and methods are known in the art, 
see, for example, WO90/11092, WO98/11779; U.S. Patent Nos. 5693622, 
5705151, 5580859; Tabata H. et al. (1997) Cardiovasc. Res. 35(3):470-479, 
Chao J et al. (1997) Pharmacol. Res. 35(6):517-522, Wolff J.A. (1997) 
Neuromuscul. Disord. 7(5):314-318, Schwartz B. et al. (1996) Gene Ther. 
3(5):405-411, Tsurumi Y. et al. (1996) Circulation 94(12):3281-3290 
(incorporated herein by reference). 

The polynucleotide constructs of the present invention may be delivered 
by any method that delivers injectable materials to the cells of an animal, such 
as injection into the interstitial space of tissues (heart, muscle, skin, lung, liver, 
intestine and the like). These polynucleotide constructs can be delivered in a 
pharmaceutically acceptable liquid or aqueous carrier. 

The term "naked" polynucleotide, DNA or RNA, refers to sequences 
that are free from any delivery vehicle that acts to assist, promote, or facilitate 
entry into the cell, including viral sequences, viral particles, liposome 
formulations, lipofectin or precipitating agents and the like. However, the 
polynucleotides may also be delivered in liposome formulations (such as those 
taught in Felgner P.L. et al. (1995) Ann. NY Acad. Sci. 772:126-139 and 
Abdallah B. et al. (1995) Biol. Cell 85(1):1-7) which can be prepared by 
methods well known to those skilled in the art. 

The polynucleotide vector constructs of the present invention used in 
the gene therapy method are preferably constructs that will not integrate into the 
host genome nor will they contain sequences that allow for replication. Any 
strong promoter known to those skilled in the art can be used for driving the 
expression of DNA. Unlike other gene therapy techniques, one major 
advantage of introducing naked nucleic acid sequences into target cells is the 
transitory nature of the polynucleotide synthesis in the cells. Studies have 
shown that non-replicating DNA sequences can be introduced into cells to 
provide production of the desired polypeptide for periods of up to six months. 

The polynucleotide construct of the present invention can be delivered to 
the interstitial space of tissues within an animal, including that of muscle, skin, 
brain, lung, liver, spleen, bone marrow, thymus, heart, lymph, blood, bone, 
cartilage, pancreas, kidney, gall bladder, stomach, intestine, testis, ovary, 
uterus, rectum, nervous system, eye, gland, and connective tissue. Interstitial 
space of the tissues comprises the intercellular fluid, mucopolysaccharide matrix 
among the reticular fibers of organ tissues, elastic fibers in the walls of vessels or 
chambers, collagen fibers of fibrous tissues, or that same matrix within 
connective tissue ensheathing muscle cells or in the lacunae of bone. It is 
similarly the space occupied by the plasma of the circulation and the lymph fluid 
of the lymphatic channels. Delivery to the interstitial space of muscle tissue is 
preferred for the reasons discussed below. The constructs may be conveniently delivered 
by injection into the tissues comprising these cells. They are preferably delivered 
to and expressed in persistent, non-dividing cells which are differentiated, 
although delivery and expression may be achieved in non-differentiated or less 
completely differentiated cells, such as, for example, stem cells of blood or skin 
fibroblasts. In vivo muscle cells are particularly competent in their ability to take 
up and express polynucleotides. 

For the naked polynucleotide injection, an effective dosage amount of
DNA or RNA will be in the range of from about 0.05 µg/kg body weight to about
50 mg/kg body weight. Preferably the dosage will be from about 0.005 mg/kg
to about 20 mg/kg and more preferably from about 0.05 mg/kg to about 5 mg/kg.
Of course, as the artisan of ordinary skill will appreciate, this dosage will vary
according to the tissue site of injection. The appropriate and effective dosage of
nucleic acid sequence can readily be determined by those of ordinary skill in the
art and may depend on the condition being treated and the route of
administration. The preferred route of administration is by the parenteral route of
injection into the interstitial space of tissues. However, other parenteral routes
may also be used, such as inhalation of an aerosol formulation, particularly for
delivery to lungs or bronchial tissues, throat or mucous membranes of the nose.
In addition, naked polynucleotide constructs can be delivered to arteries during
angioplasty by the catheter used in the procedure.
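
The weight-based ranges above become absolute amounts once a body weight is fixed. The following Python sketch, offered only as an illustration, computes the total nucleic acid mass for the more preferred range of about 0.05 mg/kg to about 5 mg/kg. The function name, its parameters, and the 25 g example body weight are assumptions made for this example and are not part of the disclosure.

```python
def dose_range_mg(body_weight_kg, low_mg_per_kg=0.05, high_mg_per_kg=5.0):
    """Return (low, high) total dose in mg for a given body weight.

    The defaults mirror the "more preferably" range quoted in the text;
    they are illustrative defaults, not a dosing recommendation.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


# Hypothetical 25 g animal (0.025 kg), chosen only for illustration.
low, high = dose_range_mg(0.025)
print(f"{low:.5f} mg to {high:.3f} mg")
```

Passing different bounds (for example 0.005 and 20.0) gives the broader preferred range; the actual dosage still varies with the tissue site of injection, as noted above.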

The dose response effects of injected polynucleotide in muscle in vivo are
determined as follows. Suitable template DNA for production of mRNA coding
for the polypeptide of the present invention is prepared in accordance with a
standard recombinant DNA methodology. The template DNA, which may be
either circular or linear, is either used as naked DNA or complexed with
liposomes. The quadriceps muscles of mice are then injected with various
amounts of the template DNA.

Five to six week old female and male Balb/C mice are anesthetized by
intraperitoneal injection with 0.3 ml of 2.5% Avertin. A 1.5 cm incision is made
on the anterior thigh, and the quadriceps muscle is directly visualized. The
template DNA is injected in 0.1 ml of carrier in a 1 cc syringe through a 27 gauge
needle over one minute, approximately 0.5 cm from the distal insertion site of the
muscle into the knee and about 0.2 cm deep. A suture is placed over the
injection site for future localization, and the skin is closed with stainless steel
clips.

After an appropriate incubation time (e.g., 7 days) muscle extracts are prepared
by excising the entire quadriceps. Every fifth 15 µm cross-section of the individual
quadriceps muscles is histochemically stained for protein expression. A time course for
protein expression may be done in a similar fashion except that quadriceps from
different mice are harvested at different times. Persistence of DNA in muscle following
injection may be determined by Southern blot analysis after preparing total cellular DNA
and HIRT supernatants from injected and control mice. The results of the above
experimentation in mice can be used to extrapolate proper dosages and other treatment
parameters in humans and other animals using naked DNA of the present invention.
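
The sampling scheme described above (every fifth 15 µm cross-section) can be sketched in Python as follows. This is only an illustration: the function name, the choice to start counting at the first section, and the 300 µm example length are assumptions, since the text does not state which section in each group of five is stained or how long the sectioned tissue is.

```python
def stained_section_offsets_um(tissue_length_um, thickness_um=15, step=5):
    """Return the starting depth (in µm) of each section selected for staining.

    Sections are assumed to be cut back to back with no loss between them,
    and the first section (index 0) is assumed to be the first one stained.
    """
    if tissue_length_um < 0 or thickness_um <= 0 or step <= 0:
        raise ValueError("length must be non-negative; thickness and step positive")
    n_sections = tissue_length_um // thickness_um  # whole sections only
    return [i * thickness_um for i in range(0, n_sections, step)]


# Hypothetical 300 µm stretch of tissue: 20 sections, 4 of them stained.
print(stained_section_offsets_um(300))
```

With these assumptions, consecutive stained sections are 75 µm apart, which is the spatial resolution at which expression along the muscle is sampled.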

It will be clear that the invention may be practiced otherwise than as particularly
described in the foregoing description and examples. Numerous modifications and
variations of the present invention are possible in light of the above teachings and,
therefore, are within the scope of the appended claims.

The entire disclosure of each document cited (including patents, patent
applications, journal articles, abstracts, laboratory manuals, books, or other
disclosures) in the Background of the Invention, Detailed Description, and Examples is
hereby incorporated herein by reference.

Sequence Listing

(1) GENERAL INFORMATION: 

(i) APPLICANT: Human Genome Sciences, Inc., et al.

(ii) TITLE OF INVENTION: 207 Human Secreted Proteins 

(iii) NUMBER OF SEQUENCES: 800 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Human Genome Sciences, Inc. 

(B) STREET: 9410 Key West Avenue 

(C) CITY: Rockville 

(D) STATE: Maryland 

(E) COUNTRY: USA 

(F) ZIP: 20850 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage

(B) COMPUTER: HP Vectra 486/33

(C) OPERATING SYSTEM: MSDOS version 6.2

(D) SOFTWARE: ASCII Text

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Kenley K. Hoover

(B) REGISTRATION NUMBER: 40,302

(C) REFERENCE/DOCKET NUMBER: PZ007PCT

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (301) 309-8504

(B) TELEFAX: (301) 309-8439



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 733 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GC3GATCCGGA GCCCAAATCT TCTGACAAAA CTCACACATG CCCACCGrTGC CCAGCACCTG 60 

AATTCGAGQG TGCACCGTCA GTCTTCCTCT TCCCCCCAAA ACCCAAGGAC ACOCTCATGA 120 

TCTCCCGGAC TCCTGAQGTC ACATGCGTQG TGGTGGACGT AAGCCAOGAA GACCCTGAGG 180

TCAACTTCAA CTGGTACGTG GACGOCGTGG AGGTGCATAA TGCCAAGACA AAGCCGCGGG 240 

AQGAGCACTA CAACAGCAOG TACCGTGrTGG TCAGCGTCCT CACCGTCCTG CACCftGGACT 300 

GQCTGAATGG CAAGGACTAC AAGTGCAAGG TCTCCAACAA AGCCCTCCCA ACCCCCflTCG 360 

AGAAAACCAT CTCCAAAGCC AAAGGGCAGC CCCGAGAACC ACAGGTOTAC ACCCTGCCCC 420 

CftTCCCGGGA TGAGCTGACC AA3AACCAGG TCAGCCTGAC CTGOCTQGTC AAAGCSCTPCT 480

ATCCAAGCGA CATOGCOC?IG GAGTCGGAGA GCAATGQGCA GCCGGftGAAC AACTACAfiGA 540 

CCACGCCTCC CGrrcCTGGAC TCCGACQGCT CCrTCTTCCT CTACAGCAAG CTCACCGTOG 600 

ACAAGAGCAG GTOGCAGCAG GC3GAACC3TCT TCTCATGCTC OGTOATQCAT GAGGCTCTGC 660 

ACAACCACTA CAOGCAGAAG AGCCTCTCCC TGTCTCOQGG TAAATGAGTG OGACGGCCGC 720 

GACTCTAGAG GAT 733



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Trp Ser Xaa Trp Ser 
1 5 
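
A five-residue motif with one unspecified position (Xaa), as in SEQ ID NO: 2, can be searched for in a one-letter protein string with a regular expression. The Python sketch below is an illustration only: the helper names and the example protein strings are hypothetical, and the three-letter lookup table covers just the residues that occur in this motif.

```python
import re

# Minimal three-letter to one-letter map; only the residues in SEQ ID NO: 2.
THREE_TO_ONE = {"Trp": "W", "Ser": "S"}


def motif_to_regex(three_letter_motif):
    """Convert e.g. 'Trp Ser Xaa Trp Ser' to a regex; Xaa matches any one residue."""
    parts = three_letter_motif.split()
    return "".join("." if p == "Xaa" else THREE_TO_ONE[p] for p in parts)


def find_motif(protein, three_letter_motif="Trp Ser Xaa Trp Ser"):
    """Return 0-based start positions of non-overlapping motif matches."""
    pattern = re.compile(motif_to_regex(three_letter_motif))
    return [m.start() for m in pattern.finditer(protein)]


# Hypothetical sequence used purely to exercise the function.
print(motif_to_regex("Trp Ser Xaa Trp Ser"))
print(find_motif("AAWSEWSGG"))
```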



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 86 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GCGCCTCGAG ATTTCCCCGA AATCTAGATT TCCCCGAAAT GATTTCCCCG AAATGATTTC 
CCCGAAATAT CTGCCATCTC AATTAG 
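
Each entry in the listing declares a LENGTH that can be checked against the residues actually printed. The following Python sketch shows one way to do this for SEQ ID NO: 3, which declares 86 base pairs; the function name is an assumption for illustration, and the check treats any purely numeric token as a position counter rather than sequence data.

```python
def count_residues(listing_lines):
    """Count sequence letters in listing lines.

    Blocks are separated by spaces; tokens that are not purely alphabetic
    (for example the running position counters 60, 120, ...) are skipped.
    """
    total = 0
    for line in listing_lines:
        for token in line.split():
            if token.isalpha():
                total += len(token)
    return total


SEQ_ID_NO_3 = [
    "GCGCCTCGAG ATTTCCCCGA AATCTAGATT TCCCCGAAAT GATTTCCCCG AAATGATTTC",
    "CCCGAAATAT CTGCCATCTC AATTAG",
]
DECLARED_LENGTH = 86  # from "(A) LENGTH: 86 base pairs"

print(count_residues(SEQ_ID_NO_3) == DECLARED_LENGTH)
```

A mismatch between the counted and declared lengths is a quick indicator that a line was dropped or a block was corrupted during transcription.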



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
GCGGCAAGCT TTTTGCAAAG CCTAGGC



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 271 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



CTCGAGATTT CCCCGAAATC TAGATTTCCC CGAAATGATT TCCCCGAAAT GATTTCCCCG 
AAATATCTGC CATCTCAATT AGTCAGCAAC CATAGTCCCG CCCCTAACTC CGCCCATCCC
GCCCCTAACT CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA TTTTTTTTAT 180
TTATGCAGAG GCCGAGGCCG CCTCGGCCTC TGAGCTATTC CAGAAGTAGT GAGGAGGCTT 240
TTTTGGAGGC CTAGGCTTTT GCAAAAAGCT T 271



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GCGCTCGAGG GATGACAGCG ATAGAACCCC GG



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GCGAAGCTTC GCGACTCCCC GGATCCGCCT C 31



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GGGGACTTTC CC 



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 73 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GCGGCCTCGA GGGGACTTTC CCGGGGACTT TCCGGGGACT TTCCGGGACT TTCCATCCTG 60
CCATCTCAAT TAG



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 256 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CTCGAGGGGA CTTTCCCGGG GACTTTCCGG GGACTTTCCG GGACTTTCCA TCTGCCATCT 60

CAATTAGTCA GCAACCATAG TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC 120

CAGTTCCGCC CATTCTCCGC CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA 180

GGCCGCCTCG GCCTCTGAGC TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGCCTAGG 240
CTTTTGCAAA AAGCTT 256



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2526 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GACAGGCTAT CXGAGAATCT GAGAGCTOGG CCCGGCAATT CTTCCAGYTA CCCTTGTGAC 60

CTAAGTCCAG TCACACATTT CCCAAAGITT CTCTTTGrCA TAACCCTGGT (rTGQCTGGTT 120

TIGRGGRCTT GAGAATGGGT CAGGGACTCC AGGCCAAGTC CAACAGAGAC CCCAAACCCA 180

CCACACACCA GCAGCCACAA CCTCACCACC AACAAAGAGG ACTTTTGTGG QGCCACAAGT 240

AAGAQGTCAT TICTOGAATC GACTCAGACC TTTAAACAGG AGAGTTGAGC ACTTCCAGKS 300

AGrrrTTAAG CAAGGCATGG GGAACAGGGA ATAGAACCTT TCAAAGAGGT TGCCCAGAGA 360

AAAGCTGQGC CTCTIGCATT GGGCTTCCTT GGAGCAGCCT CTTCTGGCAG AAAGCCATCA 420

GGTGCTCAAT CATCTTCTCC TGGCCAAGGC TCTGACCATG CTTAGTACTG GAATAGAGGT 480

C3GCCAGC3CCC CCAGCGACTC TTCITOGCCT GATGlTrGTC CTCACAGGCA TCCCAOSTGG 540

CCTCAGATGA TTCAGAACAA ATCATCCTAA CTTTGAATCC ATCCAGCCAC TTGCAAATGA 600 

TAATCAGAAG TO^SCTTCTT CACTGTTAGA AAGAAACTAA CAAAAGAGAA CCCAGAGCAA 660

TCTAGAATCT TTCAGTCCTr GCSCTTTCCAA GGATACTGCG GAGACTCTCG CCAAGCTGAT 720 

GARKHTCTGA ARTOTCACTG GCACCATATG CAACAAGAAC CACCATTCAC TGAGTAGCTA 780

ATOGCJITTCG GGCCTCGGAC ATTCCATCTG AGGTCCTTCC TGAACATGTC ACTCCACAGC 840 

AGAGGACCGG TTGCAGCTTA CXXIAGAACCA CTCCTCCAGG AGAGCTGGAT GTTTTGCGTC 900 

CAACACCTTC AGCACTGACT GCTATTGriTC AAAAAAAGCC TTTCCTGCAT TCGGAGGACT 960 

GCCCX^TTCCC CTCAGG3X3AC TTCCTAACTA TOTOCJETTCA TTAGCGAATT TATTTTTTCT 1020 

GCTGGOTGGA CATTTGTATT TTGTTAGGTT GCTGrTTTAAG CTCAftGTTTG CTGTGCTCTC 1080

TGCAGCTACA AAACATCTTC GCATATTTAA GAKTGGCTTT TATAAATAGC TTTATTCTGA 1140 

TATTAATCAG ATICCCAACT TTACTGAGAA TTAAGGACTG GGGTACTTTA AAGAAATGCA 1200 

AATAGCAATT GAAGAACCAC TCCTGCAGGT GGTAGCCCTG GCTAGACTGA ATTACACTAG 1260 

AAATCAGCCA GAAGGAftGCG TCCrTGGGAT CCCAGATCAC TCrmTTTTT TTTTTTTTTA 1320 

AAAGGGGCAG CCCCTTGATG GCTCATCTCT CTGAATAACA GrTTACGXCTT CATATCGATA 1380

CCAGATGCCT TCTTCATCAT GCCACTGAAG CCACTCACCA CCTTCAAGAA CATGCCAACC 1440 

TCTOTCAGAT TCACTTACCC ACAAACAAGG AGGCACGTTT GGCACAAAGT GTTGTCCTCC 1500 

AGCTtXAACT GGACTCTACA GftGTGCTTGA CCTCAACACA CTGGATTCCA GGTGGACTGG 1560 

ACCAAGAGCA GGCAAAGACA CGGGAACTGA AAAACTCCAC AGGGTTTGGA GAATAGAAAT 1620 

GAAAflGCCAC GTCATATAAC TCAAGAATAA ATOGTGTTTT GGAAATTTTA AAATEATCAT 1680

CGAAGOT3GT GAAACTATTT CAGGCCXMA TGAAAGGAAA TCGCCAGrTTG GGGATGAAAT 1740 

CACAGAGCCT GrXXTTTTTATG ATATOC?rTGG ATOTOCACPG ATGAAATTTT AAAGGAGrTTT 1800 

CATETrrAAA AGTGCGCATO ATTCTACATA TGAGAATTCT TTAGGCCAftG AAACTGTCCT 1860 
TGGCTCAGAG C?ICTTGGGAA TTAAAGCAGA GAGAAGCCAT TCGTCATGCT TAGAACXIAAG 1920 

GATCCTCATG TACACAAAGA CCATCGAGAC GGCCATTCTT GrTTTACAAAA CACTTACXIAA 1980
GAAAGCACrr TOIAGGGGAA CTTTAGTAAG TTCTTCTCAT TTCATTATGT TTCTrCCAAG 2040 
GAAACAGGAG AGACTGAATT AATAATTCTC TCTTTCCTCT TAAGCACTTT TAAAATAATA 2100 
AAtTTACATCT TGAAATITCG GGGGGCATCT CTGATTrAAA AAAAGAAAAA QGCrGCTTGA 2160 
TOTATCTTAT GCAGAGACAC TCTTCCCTCTG GTGGCTGCflG AGCAATACCC AAGCCTCATT 2220 

TOGAAGGCTC AACAITIGGA ATTGCACTTr AATTGATTAA TCCTCAAITC ATGTGGCCTT 2280
ACGGGATCCT GGCTCTOQGA CCCCAATTCA TTCTTATCTG CCAAAGAATT ATCTAGAAGC 2340

ACATCAAATA CCAGCACCCC ACCTGCACAA TGQGGGTGGA AAACTTTTGT ATCCCTAAGC 2400 

ATATTATrTT ATACTTCTCTG CCATC3CCATG TGGAAATACT TTATTTITAA CCTCAGGATT 2460

TAAATAAACT AAACACTATG ACATETAAAA AAAAAAAAAA AAAACTCGAG GGGGGCCCGG 2520 

TACCCA 2526



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1131 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

CflCIGCACCA GCTTTCTTAT CTGTAAAATG ATGATAATAC CAACACCTTC TTCTTGGGGT 60

ACTGAAGATC AGAGAACATG ATATCTTGTAA AGTGCCITCC ACAATACCCA GAACATAGCA 120

AACATCTEAAT GAATGTAGTA ATAGTAATTA TITrATTTTC TTTTGATTCA GrTTOGGACEA 180

TCTTCAGCIG TAACAGAATA CCCAAAATAA CTGTTTTAAA CAAATTAAAG TTTWGTTGTG 240

AACTrnCTT ACGAATTCAG ACAATCCAGG QCTTTTATAG ATGCACCAGG ATCAGCAGGT 300

ACAAAGGCAT CTTTCCTCAT TTCTGCCAGT CTCAATGCAT GGGTTGCAAT CCAGARTCCA 360

RGATGQCACT TCCAGCCCTG GTTACGCCCA TATTAGCACA CAGAAAGAAA GAGAAAGGGA 420

TOTCCCTCrr CACTTTAATC ATAGCTCCCA CTAGATGCAC CCACTACTTC TGCTGATACT 480

CCATTAGCTA ATCCTTGCTT ACATGGTrCAC ACTTACmTC CAGAGAGACA TGTCrGGACA 540

OTCATOIGCT CAATTAATAT CCAAGTGTCC AATTACTGAG AAAAAAAGAA ACTAGCACCT 600

TitxrnxjG n T GCArrccrcr tagcataagc cacattcttt ttatgaagtt GrrccrcAGrr 660

ACTTGGAroC CICAGTIGTC CTTTCAWrTA GAAAWCSOfCC TKGGACAYCC TGAAWCTGAC 720

ircmrorc atcagcacca tcactaccac tgccytcttc aaagccacca cgttctgtcc 780

CCAGGATGGT TGCAACAACC ACCATAGGGA CTTTTTGCCT TCTACTTCCA CACAATAGNC 840

CAGAffTAAGC TTTIGAAAAT GTAGGTCAGA TCATGTCTCT CTCITCCTCT TCAAAACCCT 900

CCCGA'TOGCT nTCATATTA CTCAAAAGAA AACCTAAAAC TTTGCTGTGA GATCTATGTG 960

ACCaSGCTTA TTCTTCCTCT TACTTTA3CT CTGTATTC5CT CTTCCTCACT CTACTCCAGC 1020

CATCCCACCT CCTTGCTCCT TGTCCTATAC TCCTAAAAGA AGTTCAGTCT TCCCTTATGA 1080

TATTTCCACT TAAAATAGAA AAAAAAAAAA AAAAAAAACT CGAGGGGGGC C 1131


(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 941 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GGCACGAGTA GCATTTCATT TAATCTGCAG GTATATTCTC CCAACAGTTT ATTGTCATGT 60
GATCnCCrCA GCCAAGATTG TRAGC3CAGAG AGGAGCTGTC CCAACCTACT ATACCACCGA 120
GGCTGGAGAG ATCATATTTT TGCTTATTAAA CTGGAGrrCTC TCCATCCTTC ACATTCTTGA 180
TCTCCTCTGT AGCAAACCGG AAAAGTCAGT GACAGAAGAT GCCGCTAGCG GTTTGAGCCA 240
GAGAATGACA GCrcrGGTTT GCaCAAAAGG GCCGGATGGT GGCTCTAGAA AGCCCATCCT 300
TCTCcrcnc ri ' iTriCT CC cccttatatt gtgctttcat tcattcattc attcatcaaa 360
CAOrnGTTCA gcacctatta tcstgtcaagc tctgtgctag CCTCTGGAAA ACCroCCCTC 420
ATOEAGCTCA CTGTGGAGTA GGAGAAACAA TGACTACACT ATGATAAGCA CGQGTTGrrCA 480
GGOTCTCACA GAGCAGTGGC CCCTCATCCA GACCGATGAG GTCAAAGAAG GCATCCAGGC 540
GAGGATGGTG TCAGAGCTAA CTGAAGAATG AGAGGGAGCT GCACCASCAG GGGTXGGAAC 600
TCAAGC?roGC ACTGCCTGGA GTCTTGAITC CAGCAGAGGG AGAGCAGTCT GTGAAAAGGC 660
ACCAAGGCyiG GGAGAGGGCA GAGCACATGG AGGAACTTCA GGTAGTTCTG GATGGCSCTG 720
GGGCAAAGCT AGAGAGGTAA GAAGAATCTA CAAATGTTCC TCGAGTTACA TGAACTTCCA 780
TCCCAATAAA CCCATTQGAA ACGAAAAATT TAAGTICAGAA GTGCATTTAA GGCTGGTCCG 840
actagaatoa tttttacaac gaattgatca caaccagtta cagatgtctt tgttccttct 900
ccactcccac tgcttcacct gactagcctt taaaaaaaaa a 941



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 843 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CNAGGGATAA CCCCAAAGOT GGGAAATAAA CCCTCAATTA AAGGGGGAAC CAAAAAGCTG 60

GGAAGTTCCC CCCCGCGGTG GCGGCCNGOT CTAGGAACTA GTGGAATCCX: CCGQGGCTGC 120 

AGGGAATTCG GCACGGAGTG GGAATGTTGT TTGTATGATA CTATTTCCAC AAWATGCATT 180

GAGACITOGT KTOTQGCCTA GGACATGGTC AATTCTTTYT AAATATTCCG TGAATTTCTT 240
TAGTGCATAT TCTCCGATGG GGGCTGTGGG GACAGAGTTC TAAATATGCC CATTAGATTA 300

AATCTCTTCA TTCTCTTOCT CACATCTTCT ATATCCTTAT TAATCTGTCA ATCTCTTCAA 360

GAGAGCTTCTT ATTAAAATCT CTCACrGTAT GTGTCACTTT GCCCTTAAAA TTCTGATGAT 420

TTGCTTTATA AATQGTTATA ACCATTTrCC AGGAAGAACA TTAAAGAACT TTCCATTGGC 480

ATTATCCAGT TTCCCTCAAA ATACTGGTTT TTTTTATTTT QGCTNCTAAG CAGCTATGAA 540

TCCA£?rTrcr CAGAAGCXCT TGrrcrCAAGG CATTTGTTTC CAGATTACCT TGTTAGCATC 600

CACACTAItSG GCTATTTTAG AAAAACAAAA AAAGTATCAA AATCATATAG CTATGATTTT 660

CCTOTCCTTC AAGGAGCCTT AAAGCTCATC TAGTCCAGCC AGTATTTGTT CATCCAAATT 720

CTCCCAftGAA ATCTCTATTG TCAAGATATT CTTTACCATC TTTGQGACAT TCTCATTATT 780

AGAAACAAAT CCTAAGAAGA AATTCTGCCA TAKACAACCC ATCOGTTCJTT TAAAAAAAAA 840
AAA 843



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1018 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CTOTAATrrr TAATTTTCAT ATACCGTGCT TTGATTCTAA TTTTATTTTT TGftGrrrCTCT 60 

GAAGOTTACA TATACAGAGT GCTTCAGGAA TGATCATTTT GrTATTaTTC ATGCTTCTTA 120 

ACAATCTTTCT TTTAGTCCAA GAAGATAATT GCCAGAGAAA GAATACAGTG CAGGAAAGAA 180 

GARGCTGGAG CCAGTGGrTGA AGARGGATTG AGARGACAGA CATTGTGGGA ATGAAATCAT 240

GAATAATCGT GTnTTGAAT TeTCCAAAAA CTTCTACAAA CCATGAAATG TTGGAGTrTA 300 

AATCTAATTC TTCAAAAATT CCCCACATTC CTTGrTATCCC TTAGGTTGflG CATAATTCCA 360 

CATCCCnGGA CIGATGCACT TCCCAAGAGG GGGCCTCATT AACTCTTCCG AGGCAGCAGC 420 
AOCAAGGGCA CCCCCTCCTT TCCCCCCACA CCCCAYTTCT CATGGCTCTT CTTTCTCECA 480

TCPCATTCCTT AGGTTAGAAA AGQGCACAAG GTAAGGAAGC CCTTGGGAAT AGGCTGAATC 540
TQGCTATCTA AnTCGTCCC AAATACTTAA TCTGCTTGAA TTTAAAAACA GCAAACATCT 600

AGAAAGCTTAA TTATAATTAT GAGGCCAGTT CTTTAAGCTA GCTTrmTC CTCTCTCAAA 660 

CAGCATATTC GCTTCGATOT CftGCAGGAGA AAGTCTTTTT TGCAATACAC ATAATGCATA 720
TATOOTCCTG TTAGCAATCT ATAGAAAATA GATATTGCTC ATTAAGGTAA ATATTTTTGT 780
TCATGAATGA TCTGGAATGG TCTOGACTTG TTGrTGTGAAC AGGAAATTGC TCTGrTAGGCT 840

TTCACirorc AGGTAAAGAG TGAGGCTGGT AAGATTAATT AAAGTAAATA CTGTGACAAT 900

AGGATOTCAA AACCAAAAAC GTOTTTCTGA AACTCAAGGA ATTAATGACA CATAGGGAAG 960

TTTTTCCCAT ATTAAGCATA GAGTAGGAGA GGCAAGTCAA GAATAAAAAA AAAAAAAA 1018



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 661 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TTTAAGAAAT TAGTCAATCC CCGCamSCAG GGAATTOGGC ACGAGGAGGA GGCCGTCAGC 60

TOGCAGGAQC GCAGGATCGC AGCTGYTCCC CCGGGTTGCA CCCCCCCAGY TCTGCTGGAC 120

ATAAGYTGCT TAACAGAGAG CCTGGGAGCT GGGCAGCCTG TACCTGTGGA GTGCCGGCAC 180
CGCCTGGAGG TGGCTGGGCC AAG(iAAGGGG GCTCIGAGCC CAGCATGGAT GCCTGCCTAT 240
GCCTCCCAGC GCCCTACGCC CCTCACACAC CACAACACTG GCCIOTCCGA GCTGCTC3GAG 300

CATGGAGICT CTGAGGAGGT GGAGAGAGTT CGGCGCPCAG AGAGGTACCA GACCATGAAG 360

CTIGCGCAGGG CAGGGCTCGG ACCTACCCCA GGAATGTCCT GCCCTGGGAA TGACAACACA 420

GTCCACACCA TGCACGGGGA GGCAAACAGG GGCAGCTGAC CCAGCCCAGG GGTCAGANGA 480

GGTCTTCCCG AGGAACJTGGC AGCTAAGCTG ATACCTGATA TGCACWAGKC AGCCARGYGG 540

AGACAGGCAA GGAAGAAGCT TGTTTTGAGG ACAGAATTIT CTAGATCACT CAGCACCATC 600

TOGCmTCG GGCrmTOT TTTATTTTGT TTTTGAGACG GGGTCTCGCT CICTCGCCCA 660
N 661



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 553 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GGCACAGGGC TATTTGCCCC TCTCTCCACA TGACAGAACT GCTCTAAGTT TCTTTGCTGC 60

TcrrcTCAGC tctcagacgg cttqctgcit gitttccaca ccaccatgtc tattctttcx: 120

TCTCCTTOAC TCroCCTXOT TTTTTCCTrT TGTATTTCTT CTGGCTCTTG TCCCTTTTCC 180

CACGTOTCWC AGCTITCCTT TATTGCCACT TTCAGTCAGA GCAGTCCrGT GCTTCTGGTG 240

CCGGCATACA ATACTTACTT GAGTTTCTTG GCTTTTCTTG ACTGTGCATC TCTTACTTCA 300

ACATAGGAAT AGCCTOTCAT AGAATTTCTC CAGTTCCAGG GCTCAAGAGG GAGAGTGCCA 360

GAAAATTCAG ACTOTTTTCC CTGTCTTGGA TTGAATTCAT AAAGCAAAAC CAGTGTTTGT 420

GTCAGGOTrr GCTGTGTCAT GCCTATAGGT TGTTTGGGTG CAAACCTATA GAATCCAGCC 480

TOCGAAAAGA AAGRAACCAG AGAATANCAG CATCAGAACA ATGCTTGACA TCATTTCTCA 540
ATCAAGCAGT CCA



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 869 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GGCACGAGCT GCCAACACTG AGGTCTTCGT GGCTTCTCAC ATCTAGATGT ATCCCTCTCA 60
AATCTATCCT CTATCCAGGC ACCAGATPGA GGTATCTAAA ATGTCAACTT TCCAGTTACT 120
CCTTCTTATA CTAGCCCAAT CAACTTACAA GATAAAGTCC AAGCCCCTTC ATATGACAAA 180
CCACACCCTC CTTAACTCTC CAGGTTTGAA TOCTrCATCT CCTACrTTAA ACTTTAAAAC 240
CCAGCAGCAC GAAAOTTTCT CCTATGCATG TTGCCATATG CGTTCTCTCC ATCATGCATT 300
TGCCIGAGCA AGATCTCTTC AGTTAACATC TTATTCTTTA AGACPCATTG TGGTGGTAGA 360
CAGCCTTTAA TAACGGA3CC TTGGCCAGGC ACAGTGACTC ACACCTGTAA TCCCAGAACT 420
TTCAAAGGCC AAAGAAQGAA GAAAGCTTGA OGCCAGTAGT TTGAGACCAG OCTGGGAAAC 480
AGAGAGATAT CCCATCTGTA CCAAAAATTT AAAAAAATAT TAGCAGGGAG TAGrTGGCATG 540
CACAAGIGOT CCCAGCTCCA TGGGAGASTG AGGTAGGAAC ATCACTTGAG CCCAGGAAGT 600
CAAGGCTGCA GTGAACCATG ATCAGAACAT TGCANTCCAG CTTQGGTAAC AGAGTGAGftC 660
CTTAGGTCAG AAAAATGAAT AAATAAGCAT AAAATTTTAA AAACTTAGCX: AGGCATQGTG 720
GCACACATCT GnGGTCCCTG CTACTTAGGA GGCTGAGGTG AGAGGATCCT TGAGCCCfiGG 780
AGCTTCAACAC TACAGTGAGC TATGATTGTG CCACTAAACT CCAACCTGGG TGAAAAAGCA 840
AAACCCTGCC AAAAAAAAAA AAAAAAACT 869



(2) INFORMATION FOR SEQ ID NO: 19: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GGCGftGCCGA GATCOTGCCA TTOCACTCCA GCCTGC3GCAA CAAGAGTGAA ACTCICTCTC 60

AAAAAAAAAA AATTATAATA CTATATQCCA TAAAATGACA TTTCATATTT AAAGAGTXTT 120

TTAAAACrCT TCTATTCACA TGCCATAATT TGAAACCCTA TTTCACTGAA TGAGAATQGT 180

ATCICTICTC CICATTTTTT CATETTEATC CTEAACAATT TCCACCACAG CCAGTGCATA 240
TAATQGCAAT GACACCCAGG GATQGAATCA TAAGTTCCAT CRCMGCTCAG TCAAGACGCA 300

GACTTCATGT GGCCCCAACA ACACTTCAATA ATGGAGTCTC CAAAATAAAG CTCTATAGGA 360

AAGCTAAATA CCCGCTGCAC AAGAAACCAC AGCATCTAGG TTCTAACCCC ATCTCTATGA 420
AGAGCrroCT GGGAGAGTTT TGACATIWAA CAATCTGTCT GATKGCCAAT TTmrCTTC 480

TATAAAATGA TAATGTTKGA YTCAAAGATC CAAAGTCAAT TCATGCTTCTA AAACTTAATG 540

ATrmTTAG tfrrTTG KGA C ATTTCACTGT ACACTGTAGT AATTTATATC TTATTTTCCC 600
ACTAArrrAG AAAAATATYT AAATGATCCT TAATTGGCAA TGGGTCCTAA GAArmGTT 660
TTAAATCCCr CTEACCCAAA AGAGCCCTTT TTTGrEATCTC GCAGTAGrTTA GAAGGATCTT 720
TCTAAATCTT AAAAAAAAAA AAAAAAGAAA GAAAGAAAAG AAAAGAAAAA AAGTCAGCCG 780
GGCCTGGTGG CrCATOCCTC TAATCCCAGC ACITrGGGAC CAAGGTOGAC AGATCACGAG 840
GTCAGGAGAT GGAGACCATC CCGGCCAACA TGGAGAAACC CTGTCTCTAC TAAAAAAAAA 900
AAAAACTCGA GQGGQGCCCG GTACCCAATN CGCCGGCTAG TGGTCGTAAA ACAATCAAA 959


(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1446 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CQGGGCAGGG CTOTCrGGCA CCGCCAGGGA GCGGGCCCAC CTGAGTCACT TTATTGGGTT 60 

CAOTCAACAC TTTCTTOCTC CCTGTTTrCT CITCTGTGGG ATGATCTCAG ATGCAGGGGC 120 

TGGnTTGGG CTTTrCCTGC TTGTGCCAAG GGCIGGACAC TGCTGGQGGG CTGGAAAGCC 180 

CCTCCCITCC TC?rCCTTCTG TOSCCTCCAT CCCCTCATGG GTGCTGCCAT CCTTCCTGGA 240

GAGAGQGAGG TGAAAGCTGG TGrTGAGCCCA C?rGGGTTCCC GCCCACTCAC CCAGGAGCTG 300 

GCTGGGCCAG GACCGGGAGA GGGAGCACTG CTGCCCTCCT GGCCCTC3CTC CTTCCGCAGT 360 

TAGGGC7K3GA CCGAGCCTCG CTTTCCCCAC TGTTCTGGAG GGAAGGQC3AA GGAGGGGGTC 420 
TTCAQGCIGG AGCCAGGCTG GGGGTPGCTGG GTGGAGAGAT GAGATTTAGG QGGTGCCTCA 480

TGGGGTGGGC AGGCCTQGGG TGAAATRAGA AAGGCCCAGA ACGTGCAGGT CTGCGGAGGG 540



GAAGTOTCCr GAGIGAAGGA GGGGACCCCC ATCCTGGGGG ATGCTGQCaG TGAGTGAGTG 600 

AGATGGCTGA GTCAGGGTTA TGGGGAGCCT GAGGTTTTAT GGGCCTGrrGT ATCCCCTTCT 660 

CCCGGCCCCA GCCTCCCTCC CTCCTGCCCG CCTQGCCCAC AGGTCTCCCT CTGGTCCCTG 720 

TCCCTCTCGT GGnTOGGGAT GGAQCGGCAG CAAGGGGrTGT AATGGGGCTG GGrrTCTCTCT 780 

TCTACAGGCC ACCCCGAGGT CCICAGTGGT TGCCTGGGGA GCCGGACGGG GCTCCTGAGG 840

GCTACAGGTT GGGIGGQCCC TCCCTGAGGG TCTGGGGrTCA GGCTTTGGCT CTGCTGCCTC 900 

TCACTCACCA AGTCACCTCC CTCTGAAAAT CCAGTCCdT CnTGGATGT CCTTGTGAGT 960 

CACTCrOGGC CTGGCTCTCG TCCCTCCTCA GCTTCTTGTT CCTGGGACAA GGGTCAAGCC 1020 

AQGATCGGCC CAGGCCTGGG ATCCCCCACC CCAGGACCCC CAGGCCCCCT CCCCTGCTGC 1080 

TTTGCGQGGG GCAGGGCAGA AATGGACTCC TTTTGGGrrCC CCGAGGTQGG GTCCCCTCCC 1140

AGCCCTQCAT CCTCCGTCCC STAGACCTGC TCCCCAGAGG AGGGGCCITG ACCCACAGGA 1200 

eGTGTCglGG CGCCTGGCAC TCAGGGACCC CCAGCTGCCC CAGCCCTGGT CTCTGGCGCA 1260 

TCrCTTCCCT CTTGTCCCGA AGATCTGCGC CTCTAGTGCC TTTTGAGGGG TTCCCATCAT 1320 

CCCTCCCTCA TATTGTATTG AAAATATTAT GCACACTGTT CATQCTTCTA CTAATCAATA 1380 

AACGCTTTAT TTAAAGCCAA AAAAAAAAAA AAAAAACTCG AGGGGGGGCC CGTACCCAAT 1440
TOGCCA 1446



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1471 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CAAAAAATAA TAATGATAAT TTAAAATAAA TAAGTAACTA ATAAAAAGAT TTTATATCCC 60 

AGTCTTATCA TGTTGGTTGG CAAGGCTAGA TAAAAAGATG TTAGAATGAA AGAACATATT 120 

TTTAGTCATA TGTAAATGAA GGATTCTACA ATAGTCATAT ATTTTTATAT GAATGAATGT 180 

TGGCTTTCGGC TCGAGAGGTA TGTOrGTGTA AATATAAAGG TCTCACATTC AGAGTATAGC 240 

TCriGAAATAA TGGAACTCAT GTCTACAATT CAACATGCAT CTGTATAGTT ACATCTCATG 300

TAAATATACA CAGACATATT TTGCAGCCAG TAATTGACAG TTAATGTCCA AAACAGGTGA 360 

TTCATAQOTA ACAGAAATTA GATAACCACC AATTTTQCCC AAGAGAAAGA CTAGAAGGAC 420 

TAAAAGCAGT TGAATGTATG GTACPGACAT TGTCATAAGC AGTCTGATAA CCAGTTTATT 480 

GAAACCnGTC CATTAACAGA GAATTTAATT TTAAACCCAT AATTTCTCCT ATCCATTAAA 540 

ATATTATAAT TGTTAGTAGT ATGAAACCAA CAGGAAATGT TTTTTAATCA TTTAGTGAGG 600

TGATTCATrr CTmCATOGG CAAACACTAT CCAGGAAAAG CCTTGCTTGC CTGTTTCCCA 660 

AAGAGCTCTA AGAAATAGAA TCAAGTGTAA AATGGTTCAG ACCATTCAGG ATTTCTTGTC 720 

ACICTTCTCA ACCCCGATCT TCCTGrTTATT ACTGATGTTT GAAACCCTGT CATTAGCCCC 780 

GGCCTGOTTA AAGCCCCTCA GAGTCACCTC TCATTCATAG CAATAGAATT CAACCCCAAG 840 

TOC?nGATGG TGTCCCCAGC ACAGCCGAGA GACCTGAITCT CTGGATTCAG TGCTTTTAGC 900

TCTTCGACTT TACCCTAAGA TACCTTOGGG CAATATTTTT AACCAACCCA AAAGCTCTTC 960 

AGGTCATTTC TGAAGAGGAC AAGGTGAATC TTGGCTTGGA ACACCATTTr TGCGCTCTTG 1020 

CTACTCAATG AATCAGAAAG GAATTTTTTC TGAAGAQCAT TAGAAACTAA AGGAGATCTTT 1080 

AAAATAAC?rT CTTGAAGTAT GTTTTAXATT TATCTAAAAC ACTGATTTTA AAAGTTTACA 1140 

TTCAAATOTG TATTCAAAAG AAGTACTGAT TTGTAATTAT TATAGTTTGT GTGTATCATC 1200

CCCTTTTAAC CGTGCCTAAC AACTGTACTT AAATTTTGTT TTCCTAGTGT AACAAATGTT 1260 

TCCCATAAGA TTTTCTAGAG CCAAATAATG GGAGTGAAAA ATTCCTTAAG TGTTATATAA 1320 

GAAAATATAT TAGAAAATCA GCTTTGGATT ATACGATTTC TAAAATATAC TAATACAGAA 1380 

TCCTCAC?rAA TATGTTTTGA ATTGGATTTT TTCTCAGAAC TGTTACATAA TAAATAATAC 1440 

ATCAACCAGA AAAAAAAAAA AAAAAAATTN C 1471



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1402 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

AGGGACGTCT TGCCTGAGGA GATGCCCATT TCTGTCCTGG RTTACCCTCA CTGCGTGGTG 60

CATCAGCTCC CAGAGCK3AC GGCGGAGAGT TTGGAAGCAG GTGACAGTAA CCAATTTTGC 120 

TQGAGGAACC TCTTTTCTIG TATCAATCTG CTTOGGATCT TGAACAAGCT GACAAACriGG 180 

AAGCATTCAA GGACAATGAT GCTGGTQGTG TTCAAGTCAG CCCCCATCTT GAAGCGGGCC 240 

CTAAAQCyrGA AACAAGCCAT GATGCAGCTC TATGTGCTGA AGCTGCTCAA GGTACAGACC 300 

AAATACTTCG GGCGGCAGTG GCGAAAGAGC AACATGAAGA CCATGTCTGC CATCTACCAG 360

AAGC?IGCGGC ATCGGCTGAA CGACGACTGG GCATACGGCA ATGATCTTGA TGCCCOQCCT 420 

TGGGACTTCC AGGCAGAGGA {JLXJVUCCCirr CGTGCCAACA TTGAACGCTT CAACGCCCGG 480
CGCTATGACC GGGCCCACAG CAACCCTGAC TTCCTGCCAG TQGACAACrc CCTGCAGAGT 540
CTCCTGGGCC AACQGGTOGA CCTCCCTGAG GACTTTCAGA TGAACTATGA CCTCTGGTTA 600
GAAAQGGAGG TCTTCTCCAA GCCCATTTCC TGGGAAGAGC TGCTGCAGTG AGGCTGITOG 660

TTAGGGGACT GAAA3t3GAGA GAAAAGATGA TCTGAAQGTA CCTGTGGGAC TGTCCTAGTT 720

CATTCCTCCA GTOCTCCCAT CCCCCACCAG GTGGCAGCAC AGCCCCACTG TGTCTTCCGC 780

ACTCTCTCCr GGGCTTOXrr GAGCCCAGCT TGACCTCCCC TTGGTTCCCA GGGTCCTGCT 840
CCGAAGCACT CATCTCTCCC TGAGATCCAT TCTTCCTTTA MTTCCCCCAM CCTCCTCTCT 900
TQGATATGGT TOGTTTTGGC TCATTTCACA ATCAGCCCAA GGTrGGGAAA GCTGGAATGG 960

GAraOGAACC CCTCCGCCGT GCATCTRAAT TTCAGGGGTC ATGCTGATGC CTCTCGAGAC 1020

ATACAAATCC TTGCCITrcT CAGCTTGCAA AGGAGGAGAG TTTAGGATTA GGGCCAGGGC 1080

CAGAAACrrCG GTATCTTCCrr TGTGCTCTGG GGTGGGGGTG QGGTGTTTCT GATGTTATTC 1140

CAGCCrccrc CTACATTATA TCCAGAAGTA ATTGCGGAGG CTCCTTCAGC TGCCrCAGCA 1200

dTIGATnT GGACAGGGAC AAGGTAGGAA GAGAAGCTTC CCTTAACCAG AGGGGCCATT 1260

TTTCCrrTTG GCTTTCGAGG GCCTGTAAAT ATCTATATAT AATTCTGTGT GTATTCTGTG 1320

TCATCTTTOGG CTTTTTAATG TGATTGTGTA TTCTGnTAC ATTAAAAAGA AGCAAAAATA 1380
ATAAAAAAAA AAAAAAAAAA CT 1402



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1047 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GGCACAQGGG ACTACAGGCA CCCACGACCA TACCCAGCTA ArmTGTAT TmTTGTAG 60 

AGATGGGGTT TCACGATGTC GCCCAGGCTG CTrCTTGAACT CCTGGGCTTG AGCGATCTTC 120 

CCATCTTTCC ATCTTGGCCT CCTAAAGTGC TQGGACTGCA QGCATGAGCC ACCATGCCCA 180

GCCAAGATTC TTATTGATTA CCATGTTGCT TCAAGAAGCC AAGCCAGTTT CCAATATTCC 240 

CCATTIGCrG GACTCTIQC3T ACTTTGGGTA GAAGCAACTG GTAAATTGTT AATTGGAACA 300 

NrrGCTGC?rc TAGATAACCA OSrrATGGCCA AACCTAGAGC ATCTAGGCTC ACAATTACTA 360 

TCCTGACTPG ATAACAAGTG TTCTGATATT AACCTGAAAA TGGGAATAAT GCCAAATCTG 420 

TGTAACTTAA CATCTATATA CACAGTQOGG AGAACTGAAG TTATTAAACC TQGAATCTCT 480

CTTGATCAAGG CTAACAGTAG TTATCTAAGA AGCAAAGGAC CTACAATTCT TAGACTTOGA 540 

GICATATTCT TTAAGGACGT GTTCTGAAAC TATATCAAGC ATCTGGTTTC CACGTATTTC 600 

TCCCTCAGAA ATTATOAAGT ACAAGTAAAA ATGAAGGTAC AGGGTAAGAC ACATGCTGCT 660 

TTCTTGCTCT TGAGrTCGAGA CAGTTTTCCA GCCATCTTAA CCCCTTWACA CAAAACAATT 720 

TGICTTTTAT AGCAAATAAG TGACTCAACA TAATETCAAT ATGATCTCTTA TCCACCAGTA 780

cTrrccrrrc agcttctagt cccateaartg gtttgtgaag tcatcggtta cattagccaa 840

GATAGGCCTA GACTTGAAGT CEAGAATGTT TTTCCCACTA TATQCCAAAG TAGAATSTOG 900

GTAItnCAGG GTCATTTTTG TlCTrCAATT TCCCACdCT ACAGrrPGrTTA TGATTCACTT 960

TCCXTATOTG TCTAATAAAT CTTGTTCCAT GAAATGATCA AAAAAAAAAA AAAAAAAACT 1020

CGAGGQGQQG CCCGGTACCC AAATCGC 1047



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
TIGGAAAGGG TCTAGCTCTT TCTCATTCAC CAACTATATT AGAAGCACTT GAGGGAAATT 
TACCACTCCA AATCCAAAGC AATQAACAGT CTTTTCrGGA TGATTTTATT GCCTGTGTCC 
CAGGAICAAG TGCJIGGAAGG CTTGCAAGGT GGCTTCAGCC AGATTCATAT GCGGATCCTC 
AGAAAACATC TTIGATCCTC GAATAAGGAT GATATTOCnT GrrOGTTGGCX: TACCACCATA 
ACTGTTCAAA CAAAAGACCA GTATGGGGAT GTOCTrACATG TTCCCAATAT GAAGGTAATT 
ATAACTOGAT TAAATTAGCA GACATCTATA TACTGGCTGC AATGACTGAT AAAATTTTAG 
AAATGCCAAG TOCTCAGRGT CCATTPGTTC TACXCTCTTT ATATAAAGQG TGATGCTGAA 
AGITTOrrrA AATCACTTGT TTATATTAAT TAGTCCCCAA GTGTCCAAGr TACACCTGTT 

TrmTCTCA G - m tj ri crr tacattttgc tacctgttac ggggactcaa aggagggata 

AGAAACTA1C CATCTAAAGA GTGCTAGACA CATACAGTPGA AGCCCCTCAA TATGTATTGA 
1TCAATAAAT GCATGAAAGA ATACATrrTT AAATmCTG TATA&TTTTG AAAGACTCAA 

GTACGrrcTG rorrrGGTAT tactgaaacc acattttaaa aataacactc attaagttag 
aaatatatca gtttagattg taaaagaatg aggaattgaa atagrttgtat accatattga 
tgaatataga gtttttagga tacctcttac ctgaaatatt aataataatg tttncagagc 
atattataca taattattrg tgatttaatc tcrttaatatg aataoxnxrat ttaaaacttt 
tatttctgaa aaaattatat tgaataaaat tttatatagg cagtccxxiag ccctttcctc 
cttcaaagtt cttcttataga gtcmtggtt 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1208 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
TAATCGCTAC TATAGGGAAA GCTGGrTCGCT GCAGGTACCG GTCCGGAATT CCGGGTOGAC 60
CCACGCCTCC GAGCGAAATG GOGCCTCCOG CCCCCGGCCC GGCCTCCGGC OGCTCCGGGG 120
AGC?r AGACQA GCTGTTCGAC GTAAAGAACG CCTTCTACAT CGGCAGCTAC CAGCAGTGCA 180
TAAACGAGGC GCASGGGTGA AGCTRTCAAG CCCAGAGAGA GACGTGGAGA GGGACGTCTT 240
CCTCTATAGA GCGTACCTGG CGCAGAQGAA GTTOGGTGTG GTCCTGGATG AGATCAAGCC 300
CrcCTCGGCC CCIGAGCTCC AGGCXCTGCG CATGrprTGCT GACTACCTCG CXXACGAGAG 360
TCGGAGGGAC AGCATCGTGG ODGAGCTGGA CXX5AGAGATG AGCAGGAGCK TGGACGTGAC 420
CAACACCACC TPCCTCCTCA TOGCCGCXTTC CATCTATCTC CACGACCAGA ACCCXX5ATGC 480
CGCCCTGCOT GCGCTCCACC AGGQGGACAG CCTGGAGTGC ACAGCCATGA CAGTGCAGAT 540
CCTGCTCAAG CTGGACCXXX: TGGACCTCGC CCGGAAGGAG CTGAAGAGAA TGCAGGACCT 600
GGACGAGGAT GCCACCCTCA CCCAGCTCGC CACTGCCTGG GTCAGCCTGG CCACGGGTQG 660
TGAGAAGCTG CAGGATGCCT ACTACATCTT CCAGGAGATG GCTGACAAGT GCTCGCCCAC 720
CCTGCTCCTG CTCAATOGGC AGGCGGCCTG CCACATGGCC CAGGGCCGCT GGGAGGCCGC 780
TCAGGGCCTC CTCCAGGAQG CGCTAGACAA GGATAGTQGC TACCCRGAGA CGCTGGTCAA 840
ccrcATCCTC crorcccAGC acctkggcaa gccccxttgag gtgacaaacc gatacctgtc 900
ccagctgaag gatgcccaca ggtcccatcc cttcatcaag GAGTACCAGG CCAAGGAGAA 960
CGACTTTCAC AGGCTCGTGC TACAGTACGC TCCCAGCXXTT GAGGCTGGCC CAGAGCTGTC 1020
AGGACCATCA AGCCAGGACA GAGGCCAGGA GCCAGCCCTG CAGOXTCCC CACCCGGCAT 1080
CCACCTGCAT CXXTCTOGQG CAGGAGCCCA CCCCCAGCAC CCCCATCTGT TAATAAATAT 1140
CTCAACTCCA RQGTCITCCA CXHXSAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1200
AAAAAAAA 1208

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1922 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GIGCTCCGCT ACTCAGCAGC GCCATGGAGG ACTCTGAAGC ACTGGGCTTC GAACACATGG 60 

GCCTCGATCC CCGGCTCCTT CAGGCTGTPCA CCGATCTOGG CTGGTCGCGA CCTACGCTGA 120 

TCCAGGAGAA GGCCATCCCA CTGGCCCTAG AAGOGAAGGA CCTCCTGGCT OGQGCCCGCA 180 

CGGGCTCCGG GAAGACGGCC GCTTATGCTA TTCCGATGCT GCAGCTGTTG CTCCATAGGA 240 

55 AGGCGACAGG TCCGOTGGTA GAACAGGCAG TGAGAGGCCT TGrrPCTPGTT CCTACCAAGG 300 

AGCTCGCACG GCAAGCACAG TCCATGATTC AGCAGCTGGC TACCTACTGT GCTCGGGATG 360 

TCCGAC7K3GC CAATOICTCA GCTGCTGAAG ACTCAGTCTC TCAGAGAGCT GTGCTGATGG 420 

AGAAGCCAGA TOTGGTAGTA GQGACCCCAT CTCGCATATT AAGCXZACTTG CAGCAAGACA 480
GCCTGAAACT TOGTCACTCC CTGGAGCTTT TOGrrGGTrGGA CGAAGCTGAC CTTCTTTTTT 540
COTTCGCTT TGAAGAAGAG CTCAAGAGTC TCCTCTGTCA CTTGCCXXX3G ATTTACCAGG 600
CTTTTCTCAT CTCAGCTACT TTTAACGAGG ACGTACAAGC ACTCAAGGAG CTGATATTAC 660
A-EAACCCGGT TACCCTTAAG TTACAGGAGT CCCAGCTGCC TGGGCCAGAC CAdTACAGC 720
AC?rrrCAGGT OGTCrcrPGAG actgaggaag acaaattcct cctgctgtat gccctgctca 780
AGCTCTCATT GATTCGGGGC AAGrTCTCTGC TCTTTGrCAA CACTCTAGAA CGGAGTTACC 840
GGCTACGCCT CTTCnGGAA CAGTTCAGCA TCCCCACCTG TGTGCTCAAT QGAGAGCTTC 900
CACTGCGCTC CAGCTTGCXAC ATCATCTCAC AGTTCAACCA AGGCTTCTAC GACTGTGrrCA 960
TAGCAACTGA TCCTGAAGTC CTGGGGGCCC CAGTCAAGGG CAAGCGTCGG GGCCX5AQGGC 1020
CNAAAGGGGA CAAGGCCTCT GATCCGGAAG CAGGTGTGGC CCGGGGCATA GACTTCCACC 1080
ATOTGrCTGC TCTGCTCAAC TTTGATCTTC CCCCAACCXX: TGAQGCCTAC ATCCATCGAG 1140
CTGGCftGGAC AGCACGCGCT AACAACCCAG GCATAGTCTT AACCmGTG CTTCCCACGG 1200
AGCACTTTCCA CTTAGGCAAG ATTGAGGAGC TTCTCAGTGG AGAGAACAQG QGCCCCATTC 1260
TGCTCCCCTA CCAOTTCCGG ATOGAGGAGA TOSAGGGCTT CCGCTATCGC TGCAGGGATG 1320
CCATGCX3CTC AGIGACTAAG CAGOCCATTC QGGAGGCAAG ATTGAAGGAG ATCAAOGAAG 1380
AGCTTCTCCA TTCTGAGAAG CTTAAGACAT ACTTTGAAGA CAACCCTAGG GACCTCCAGC 1440
TOCIGCGGCA TCACCTACCT TTGCACCCCG CAGrTGGrTGAA GCCCCACCTG GGCCATGrTPC 1500
CT6ACTACCT GGTTCCTCCT GCTCTCCGTG GCCTQGTRCG CCCTCACAAG AAGCGGAAGA 1560
ftGCTCTCTTC CTCTTGTRGG AAGQCCAAGA GAGCAAAGTC CCAGAACCCA CTGCGCAGCT 1620
TCAAGCACAA AGGAAAGAAA TTCAGAOXA CAGCCAAGCC CTCCTGAGGrr TGTrGGGCCT 1680
CTCTGGAGCT GAGCACATTG TQGAGCACAG QCTCACACCC TTCGTCGACA GGCGAGGCTC 1740
TGCTTGCTTAC TGCACAGCCT GAACAGACAG TTCTQtSGGCX: GGCAGTGCTG GGCXTTTTAG 1800
CTXrTTGGCA CITCCAAGCT GGCATCTTGC (XCTTGflCAA CAGAATAAAA ATTTTAGCTG 1860
CCCCAAAAAA AAAAAAAAAA AAAAAAACTC GRGOGGGGGC* CCGTACCCAA TTCGCCCTAT 1920
AA 1922



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1951 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

^ TCGTCCCCAG AGCGGCSCTGA GCCCCAQGCG SAGGGTQGCG GGGGAGCCTG GGGGAGCCGC 60 

CGCCACCrcC ACC3GGCCTCT CTCAGCTCGG ACACCAGCGC CCTC3TCCTAT GACTCTGTCA 120 

10 ACTACACGCT GCTTOGTAGAT GAGCATGCAC AGCTQGAGCT GGTGAGCCTG CGCCGTGCIT 180 

CGGAGACTAC ACJTCACGAGA GTGACTCPGC CACCGTCTAT C5ACAACTGTG CCTC03TCTC 240 

CTCGCCCTAT GAGTCGGCCA TCGGAGAGGA ATATGAGGAG GCCCCGCGGC CCCAGCCCCC 300 

15 

TOCdGCCTC TCCGAGGAAC TCCACGCCTG ATGAACCCGA CGTCCATTTC TCCAAGAAAT 360 

TCCTGAACCyr YTTCATXSAGT GGCCGCTCCC GCTCCTCCAG TGCTGAGTTCC TTCGGGGTOT 420 
20 TCTCCTGCAT CATCAACGGG GAGGAGCAGG AGCAGACCCA CCGQGCCATA TTCAGGTTPG 



25 



45 



AGQCIGAAGA CTACrGGTAC GAGGCCTACA ACATQCGCAC TGGTGCCCGG GGTOrCTTTC 
CTGCCTATTA CGCCATCGAG GTCACCAAGG AGCCCGAGCA CATGGCAGCC CTGGCCAAAA 



480 



TGCCTCGACA CGAAGACGAA CTTGAGCTGG AAGTQGATGA CCCTCTGCTA GTrGGAGCTCC 540 



600 
660 



ACACTGACTG GGrTGGACCAG TTCCGQGTGA ACyTTCCTGGG CTCAGTrCCAG GTTCCCTATC 720 



780 
840 
900 



30 ACAAGQGCAA TCACGTCCTC TGTOCTGCTA TGCAAAAGAT TGCCACCACC CGCCGGCTCA 
CCGTGCACIT TAACCCGCCC TCCAGCTGTG TCCTGGAGAT CAGCGTGCGG GGTGTGAAGA 
TAGGCGICAA GGCCGATCAC TCCCAGGAGG CCAAGGQGAA TAAATGTAGC CACTTTTTCC 
ACHTAAAAAA CATCTCTrTC TGCGGATATC ATCCAAAGAA CAACAAGTAC TTTGGGTTCA 960 
TCACCAAGCA CCCCGCCGAC CACCGCTTTTG CCTGCCACGT CTTTGTGTCT GAAGACTCCA 1020 

40 CCAAAGCCCT GGCAGAGTCC GTGGGGAGAG CATTCCAGCA GTTCTACAAG CAGrTTTGfrGG 



1080 



AGTACACCIG CCCCACAGAA GATATCTACC TGGAGTAGCT GTGCAGCCCC GCCCTCTGCG 1140 



1200 
1260 



TCCCCCAGCC CTCAGGCCAG TGCCAQGACA GCTGGCTGCT GACAGGATGT GGCACTGCTT 
GAGGAGGGGC ACCTGCCACC GCCAGAGGAC AAGGAAGTGG GGCGCTGGCC CAGGGTAGQG 

GAGGGTQGGG CAATCGGGAG AGGCAAATGC AGTrTATTGT AATATATQGG ATTAGATTCA 1320 

50 TCTATOGAGG GCAGAGTGGG CTGCCTGGGG ATTGGGAGGG ACAGGGCTTG GGGAGCAGGT 1380 

CTCTGGCAGA GAAGGATOTC CGTTCCAGGA GCACACGGCC CTGCCCCATC CTGGGCCTTA 1440 

CCTCCCCTGC CAGGGCrcXSG GCGCTGTGGC TCCTGCCTTC ATGAAGCCCG TGTCCTGCCT 1500 

TGATGAAGCC TOIGCCACCT GCAAGTGCCC GCCCTGCCCC TGCCCCAACC CCCACCGAAG 1560 

AGCCCTGAGC TCAGGCTGAG CCCAGCCACC TCCCAAGGAC TTTCCAGTGA GGAAATGC3CA 1620 

60 ACACCT3GAG OTGAACTICCC TGnCTCAGC TCCGTCATCT GCGGGGCTTC TGGGTGGCTC 1680 
CTCCCACTGA CCTCACCGGC ATGCTGGCCT GTOGCAGGCC TACGACCTCA GGCGGGGAGG 1740
AGGAGCTGCC GCAAGGCCCT GTCCCAGCAG AAGAGGGAGG CTTCCTGACT GACACAGGCC 1800
AGCCCCATCT TQGTCCTGTC ACXXTTGGCCC CAACTATTAA AGTGCCATTT CCTGTCAAAA 1860
AAAAAAAAAA AAAATCGGGG GGGGCCCGGA ANCCAATTTC CCCCAAAAAG GGGGGTITATA 1920
AAAATTCCCN GGCNGTGTrT TTAAAAATTC G 1951



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3989 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

25 GGCACAGGCC GCAGQGNACC TATQGGCGCA TATAGGTrCT AATGAAACTG TAGTCTCAGT 60 

TGGAAGCCTA GACATOAAAT GGGTCAGTGA OCAAGGCTCT ATTCCTAGTC TCCAGCCATG 120 

CCTGTCGAAC CTCARCCCRC TCTCAGCACA TTGGACCCAG GCAGATGYAA AAAATTCACA 180 

GAACTATGAT TTCGACTCAA GGGrTTTGTAG ATTTCCTCCT TCATTCTAAT TTCAGTCrTCT 240 

AAAATTCTTG CATCCRTGAA CGftCCTCXSGC ATTTGATGAG ACAGGGCYGA ATACTGCAGT 300 

35 TTTCCTCCTA GAAATCATCT GGGGCATTTT CTTTGAACTG ATGGGAACAA T.-J«3GCATAA 360 

CKTmGCAC AAACTTOGGA TAAETGATTT TGGGATAACG ATCTACCAGA ATGGGGATAT 420 

TTCACCCTTG CyTTCTGAGAT GCAAACCAAA GAATATCATG ACCAGCTTTC AGGCCTCCTG 480 

40 

AAC?rATATCT CTCACATTGT CCTCrTTCTCA TGCTGAGGAG CCTGAGATCC CTGTGTGGGG 540 

ATTAGACACyr GGACTGTTAT GGGTGTAGGT GAATTGGCTT ATTTTGTCTG TCCCTOTCTG 600 

45 AATOTOTGC AGGAAVTAAA AAGGACCAAG AAGAGGAAGA AGACCAAGGC CCACCATGCC 660 

CCAGGCTCAG CAGGGAGCTG CTOGAGGTAG TAGAGCCTCA AGTCTTGCAG GACTCACTOG 720 

ATAGATGTTA TTCAACTCCT TCCAGTTGTC TTGAACAGCC TGACTCCTGC CAGCCCTATG 780 

50 

GAAGTTCCTT TTATCCATTG GAQGAAAAAC ATGTTGGCTT TTCTCTTGAC GTGGGAGAAA 840 

TTGAAAAGAA GGQGAAGGGG AAGAAAAGAA GGGGAAGAAG ATCAAAGAAG GAAAGAAGAA 900 

55 GGGGAAGAAA AGAAGQGGAA GAflGATCAAA ACCCACCATG CCCCAGGCTC AGCAGOGftGC 960 

TGCTGGATCA GAAAGRGCCT GAAGTCTTGC AGGACTCACT GGATAGATGT TATTCAACTC 1020 
CTTCACTICT GITCAACTGT GrPGACTCATG CCAGCCCTAC AGAAGTGCCT TTTATGTATT 

60 
GGAGCAACAG CATCTTOXT TGGCTOITGA CATGGATGAA ATTGAAAAGT ACCAAGAAGT
GGAAGAAGAC CAftGACCCAT CATGCXXCAG GCTCAGCAGG GAGCTGCTQG ATGAGAAAGA 
GCCTGAACTC TTCCAGGACT CACTQGATAG ATCTTATrCG ACTCCTTCAG GTTATCTTGA 
ACTGCCTCAC TTAGGCCAGC CXTTACAGCAG TGCKGTTTAC TCATTGGAGG AMCAKTACCT 
TGC3CTTKKCT CTTCACGTGG ASAAATTGAA AAGAAC3GGGA AGGGGAARAA AAGAAGC3GGA 
AGAAGATCAA AGAAGGAAAG AAGAAGGGGA AGAAAAGAAG QGGAAGAflGA TCAAAACCCA 
CCATCCCCCA GGCTCAGCAG GGAGCTGCTG GATGAGAAAG GGCCTGAACyr CTTGCAGGAC 
TCACTGGATA GATOTTATTC AACTCCTTCA GGTTGTCTTG AACTGACTGA CTCATGCCAG 
CCCTACAGAA CyTGCCmTA YRTATTGGAG CAACAGYGTG TTGCXnTCGC TGTTGACATG 
GATCAAATTC AAAflCTACCA AGAAGrPQGAA GAAGACCAAG ACCCATCATG CCCCAGC3CTC 
AGCAGGGAGC TCCTGGAIGA GAAAGAGCCT GAAC3TCTTGC AGGACTCACT QGATAGATCT 
TATTCGACTC CTTCAGCnTA TCTTGAACTG CCTGACTTAG GCCAGCCCTA CAGCAGTGCT 
GmACTCAT TGGAGGAACA GTACCTTOGC TTQGCTCTTG ACGTGGACflG AATTAAAAAG 
GACCAAGAAG AGGAAGAAGA CCAAGGCXTA CCATGCCCCA GGCTCAGCAG GGAGCTGCTG 
GAGOTACTAG AGCCTGAAGT CTTGCAGGAC TCACTGGATA GATGnTATTC AACTCCTTCC 
AGTTCTCrTG AACAGCCTGA CTCCTGCCAG CCCTATGGAA GTTCCTTTTA TGCATTGGAG 
GAAAAACATG TTGGCTTTTC TCTTGACGTG GGAGAAATTG AAAAGAAGGG GAAGGGGAAG 
AAAAGAAGGG GAAGAAGATC AAMGAAGRAA AGAAGAAGGG GAAGAAAAGA AGGGGAAGAA 
GATCAAAACC CACCATGCCC CAGGCTCAAC GGCGTGCTGA TGGAAGTGGA AGAGCSTGAA 
GTCTTACAGG ACTCACTQGA TAGATGTTAT TCGACTCCGT CAATGTACTT TGAACTACCT 
GACTCATTCC AGCACTACAG AAGTGTGTTT TACTCATTTG AGGAACAGCA CATCAGCTTC 
GCCCTTTACG TGGACAATAG GTTTTrTACT TTGACGGTGA CAAGTCTCCA CCTGGTGTTC 
CAGATCQGAG TOWMTCCC ACAATAAGCA GCCCTTASTA AKCCGAGAGA TGTCATTCCT 
GCAGGCAGGA CCIATAGGCA MGTGAAGATT TGAATGAAAG TACAGTrCCA TTTGGAAGCC 
CAGACATAGG ATGGGrTCAGT GGGCATGGCT CTATTCCTAT TCTCAAACCA TGCCAGTGGC 
AACCTCTTCCT CAGTCTCAAG ACAATGGACC CACGTTAGGT GTGACACGTT CACATAACTG 
TGCAGCACAT GCCGGGAGTG ATCAGTCRGA CATTTTAATT TGAACCACGT ATCTCTGGGT 
AGCTACAAAA TICCTCAGGG ATTTCATTTT GCAGGCATGT CTCTGAGCTT CTATACCTGC 
TCAAGGTCAK TGrCATCTTT GTGTOTAGCT CATCCAAAGG TGTTACCCTG GrTTCAATGA 
ACCTAACCTC ATTCTTTCTC TCTTCAGTGT TGGCTTGrrrT TAGCTGATCC ATCTGTAACA 



1140 

1200 

1260 

1320 

1380 

1440 

1500 
1860
CAGGAGGGAT CCTTGGCTGA GGATTGTATT TCAGAACCAC CAACTGCTCT TGACAATTGT
TAACCCGCTA GRCTCCTTTG GTTAGAGAAG CXIACAGTCCT TCAGCCTCCA ATTQGrrGTCA 
CTACTTAGGA AGACCACAGC TAGA3X3GACA AACAGCATPG GGAGGCCTTA GCCCTGCTCC 
TCTCRATTCC ATCTTGTAGA GAACAGGAGT CAGGAGCCGC TGGCAGGAGA CAGCATGTCA 
CCCAGGACTC TGCXXXTTGCA GAATATGAAC AAYGCCATCT TCTTGCAGAA AACGCTTAGC 
CTGAOTITCA TAGGAGCrTAA TCACCAGACA ACTQCAGAAT GTRGARCACT GAGCAGGACA 
GCTGACCTCT CICCTTCACA TAGTCCATRT CACCACAAAT CACACAACAA AAAGGAGARG 
AGATATTTTG GCTTCAAAAA AAGTAAAAAG ATAATGTAGC TGCATTTCTT TAGTTAITTr 
GARCCCCAAA TATTTCCTCA TCTTTTTGTT GTTGTCATKG ATGGrTGGTCA CATOGACITG 
TTTATAGAGG ACAGGTCAGC TGTCTGGCTC AGrTGATCTAC ATTCTGAAGT TGTCTGAAAA 

TCnCTTCArrG attaaattca gcctaaacgt tttgccggga acactgcaga gacaatgctg 

TGAOTTTCCA ACCTYAGCCC ATCTGCGGGC AGAGAAGGTC TAGTTTGTCC ATCASCATTA 
TCATCATATC AQGACTGGTT ACITGGTrAA GGAGGQGTCT AGGAGATCTG TCCCTTTTAG 
AGACACCTTA CTTATAATGA AGTATTTGGG AGGGrTGGTTT TCAAAATTAG AAATGTCCTG 
TATTCCRATC ATCATCCTOT AAACATTTTA TCOTTTATTA ATCATCCCTG CCTGTGTCTA 
TTATTATATT CATATCTCTA CXXTTOGAAAC TTTCTGCCTC AATGmTACT GrTGCXTTTTGrr 
TrrrGCTAGT CJll^lVntil^TT GAAAAAAAAA ACATTCTCTG CCrGAGTTTT AATTnTGrrC 
CAAAGITATT TTAATCTATA CAATTAAAAG CTTTTGCCTA TCAAAAAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAGCGGA CGCGTCGCC 
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3735 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CTGCICTTCG CTCGCTGGGC TCCGCAGCAG GCTTGGCCAG CSGCTGACGG GTCGGCQGGC 60
GGCTTTCTCT GAACAGGCAC GCAQCTGCAG ATTTTATrCT GGTAGTGCAN CCCTCTCAAA 120
GCJTTCAAGGA ACTGATGTAA CAGOGATTGA AGAAGTAGTA ATTCCAAAAA AGAAAACTTG 180
GGATAAAGTA GCCGTTCTTC AGGCACTTGC ATCCACAGTA AACAGGGATA CCACAGCTGT 240
GCCTTATCTC TTTCAAGATC ATCCTTACCT TATGCCAGCA TCATCTTTGG AATCTCGTTC 300
ATTTTTACTG GCAAAGAAAT CCGGGGftGAA TGTCGCCAAG TrTATTATrA ATTCATACCX: 360

CAAATATTTT CAGAAGGACA TAGCTGAACC TCATATACCG TGTTTAATGC CTGAGTACTT 420 

^ TGAACCTCAG ATCAAAGACA TAAGTGAAGC OGCCCTGAAG GAACGAATTG AGCTCAGAAA 480 

ACnCAAAGCC TCKJTCGACA TOTTTGATCA GCTTTTGCAA GCAGGAACCA CnCTGTCTCT 540 

10 TGAAACAACA AATAGTCTCT TGGATTIWrr GTGTrACTAT GGTGACCAGG AGCCCTCAAC 600 

TGATTACCAT TTTCAACAAA CTGGACAGTC AGAAGCATTG GAAGAGGAAA ATGATGAGAC 660 

ATCTAGGAGG AAAGCTGGTC ATCAGTTTGG AGTTACATGG CGAGCAAAAA ACAACGCTGA 720 

GAGAATCTTT TCTCTAATGC CflGAGAAAAA TGAACATTCC TATTGCACAA TGATCCGAQG 780 

AATCCTGAAG CACCGAGCTT ATGAGCAGGC ATTAAACTTG TACACTGAGT TACTAAACAA 840 
20 CAGACTCCAT GCTGAICTAT ACACATTTAA TGCATTGATT GAAGCAACAG TATGTTGCXSAT 
AAATGAGAAA TTTCAGGAAA AATGGAGTAA AATACTGGAG CTGCTAAGAC ACATGGTTGC 



25 



55 



900 
960 



ACAGAAGCTG AAACCAAATC TTCAGACTTT TAATACCATT CTGAAATGTC TCCGAAGA3T 1020 



1080 



TCATCICTIT GCAAGATCGC CAGCCTTACA GGTTTTACGT GAAATGAAAG CCATTQGAAT 

AGAACCCTCG CTTQCAACAT ATCACCATAT TATTCGCCTG TTTGATCAAC CTGGAGACCX: 1140 

30 TTTAAAGAGA TCATCCrTCA TCATTTATGA TATAATGAAT GAATTAATOG GAAAGAGA2T 1200 

TTCTCCAAAG GACCCGGATC ATGATAAGTT TTTTCAGTCA GCCATGAGCA TATGCTCATC 1260 

TCTCAGAGAT CTAGAACTTG CCTACCAAGT ACATCGCXTTT TTAAAAACCG GAGACAACTG 1320 

35 

GAAATICATT GGACdGATC AACATCGTAA TTTCTArrAT TCCAAGrTTCT TCGATTTGAT 1380 

nCTCTAATC GAACAAATTG ATGTTACCTT GAAOIGGTAT GAGGACCTGA TACCTTCAGC 1440 

40 CTACTTTCCC CACICCCAAA CAATGATACA TCTTCTCCAA GCATTQGATG TGGCCAATOS 1500 

GCTAGAAGTC ATTCCTAAAA TTTOGAAAGA TAGTAAAGAA TATGGTCATA CTTTCCGCAG 1560 

TGACCTGAGA GAAGAGATCC TGATGCTCAT QGCAAGGGAC AAGCACCCAC CAGAGCTTCA 1620 

45 

GcyroGCATTT gctgacktk; CTGCPGATAT CAAATCTGCG TATGAAAGCC AACCCATCAG 1680 

ACAGACTCCT CAGGATTOGC CAGCCACCTC TCTCAACTCT A3CAGCTATCC TCTTTTTAAG 1740 
50 GGCTOGGAGA ACTCAGGAAG CCK3GAAAAT CTTGGGGCTT TTCAGGAAGC ATAATAAGAT 1800 
TCCTAGAAOT GAOTTCCTCA ATCAGCTTAT GGACAG?rGCA AAAGrTGTCTA ACAGCXCTTC 1860 
CCAGGCCATT GAA!3rA£?rAG AGCTGGCAAG TGCCITCAGC TTACCTArrT GrTGAGGGCCT 1920 
CACCCAGAGA GTAATCAGTG ATTTTGCAAT CAACCAGGAA CAAAAGGAAG CCCTAAGTAA 1980 
TCTAACTCCA TTGACCACTG ACAGTGATAC TGACAGCAGC AGTCACAGCG ACAGTCACAC 2040 
60 CAC7K5AAGGC AAATCAAAG?r GGAGATTCAG GAGCAGCAAT GGTCTCACCA TAGCTGCTGG 2100 
AAICACACCT GAGAACTGAG ATATACCAAT ATTTAACATT GTTACAAAGA AGAAAAGATA
CAGATTTCGT GAATTTCTTA CTGTGAGGTA CAGTCAGTAC ACAGCTGACT TATGTAGATT 
TAAGCrCCTA ATATCCTACT TAACCATCTA TTAATGCACC ATTAAAGGCT TAGCATTTAA 
GTAGCAACAT TGCGC?mTC AGACACATGG TGAGGTCCAT GGCTCTTGTC ATCAGGATAA 
GCCTOCACAC CTAGAGICTC GGrTGAGCTGA CCTCACGATG CTGTCCTCGT GCGATTGCCC 
TCTCCTGCTG CTQGACTTCT GCCTrTGTTG GCCTGATGTG CTGCTGTGAT GCTGGTCCTT 
CATCTTAGGT GTTCATOCAG TTCTAACACA GTTGGGGTTG GGTCAATAGT TTCCCAATTT 
CAGGATATTT CGATOTCAGA AATAACGCAT CTTAGGAATG ACTAAACAAG ATAATGGCAG 
TTTAGGCTGC ACAACTGGTA AAATGACTGT AGATAAATGT TGTAATTAGT GTACACGTTT 
CJTATTmCT TAATATAGCC GCTGCCATAG TTTTCTAACT TGAACAGCCA TGAATGITTC 
ATOTCTCCCT ' ITlTrrri TG TCTATAGCTG TTACCTATTT TAGTGGTTGA AATGAGAGCT 
ACTGATCACA GAAQGATCTC GAATGrrCTTC TTGACATCAT TGTGTATTGC TGCTTAATCAA 
GTTOOTAACG ACTACTTCTA GCAGCTCTTA CCACTATGAC TTAAGTOGTC CTGGAAGGCA 
CTAAGIGGAG OTTTGCAGCA TTCCTGCCTT CATGAGGGCT TCTACCACTG ACCACTTTGC 
ACCTACCTX3G CTCCCAGATT TACTTAGGTA CCCCACGAGT CGTCCACATA AGCAGCTTCA 
TCrTTACCTT GCCAGAGTTG ACAATTATGG GATACTCTAG TCTACTTATA CTTGTGTTCC 
CATCTCTCTG CCATCCTCTG AAGGCCAGGA CCCMTO^TA CATCCTTAGA AACCAAAGTA 
TGGmTTCT TTTCTCTTGG AATGTCAGGT CTTAAGGCAT •ITAATTGAGG GACAAAAAAA 
AAAAAAAGCC GATATAGTAG CTAGCTACTT AAGCATCCAT GGGrTATTGCT CCATATCAAA 
GCAGATTIGC AGGACAGAAA GAGTAAATTA GCCTTCAGTC TrGGTTTACA GCTTCCAAGG 
AGAGCCTTGG CCACCTGAAA TGrTTAACTCG GrTCCCTTCCT GTCTCTAGTT CATCAGCACC 
TOCAGATGCC TGACICTTCT TAGCCTTACT ATTCAATACA GTCCTTAGAT TCACGGTATG 
CCrCTTCCTA TCCAGGCACC TATTCTGAAT CACCATGTTG CTCTGCAGCT AGAGrTTGATA 
GGAGAAAATC CATTTGGGTA GATGGCCTAT GAATTTGTAG TAGACTTTCA AAATGAGTGA 
•mCTTAGCT TGCTACTTTT AAGTTTGTGG TACAGATCCT CCAAACCCAT ACTCTGAGCA 
A-rrAACTQCC TTCAACATAG AGAAAATTAA GGCXniCACAG GATGAGTPCTC CATTCTCTGT 
AAATCCTTAT TTTATCATAG TCTTTAGCCN CTACTATGAG TAAAATGTTC TCTTCNQCCX; 
GGTOTGGrPGA CTCAC 
(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1667 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TACTAATTCA TTTAACTCCT CTTACATGAG TAGCGACAAT GAGTCAGATA TCGAAGATGA 60 

AGACTTAAAG TTAGAGCTOC GACGACTACG AGATAAACAT CTCAAAGAGA TTCAGGACCT 120 

15 GCAGACTCGC CAGAAGCATG AAATTGAATC TTTGTATACC AAACTGGGCA AGGTGCCCCC 180 

TCCTOrrATT ATTCCCCCAG CTGCTCCCCT TTCA0C3GAGA AGACGACGAC CCACTAAAAG 240 

CAAAGGCAGC AAATCTAC?rC GAAGCAGTTC CTTGQGGAAT AAAAGCCCCC AGCTITCAGG 300 

TAACCTCTCT GGTCAGAOrG CAGCTTCAGT CTTQCACCCC CAGCAGACCC TCCACCCTCC 360 

TQGCAACATC CCAGAGTCCG GGCAGAATCA GCTGrTTACAG CCCCTTAAGC CATCTCCCTC 420 
25 CAfflGACAAC CKrrATTCAG Cr^^ . 

TQCrCCAGCT CAAQGAACCA GCAGCACAAA CACTGTTGGG GCAACAGTGA ACAGCCAAGC 540 

CXXXXMGCT CAGCCTCCTC CCATCAOGTC CAGCAGGAAG GGCACATTCA CAGATGACTT 600 

30 

GCACAA£?ITO CTAGACAATT GGGCCCGAGA TGCCATGAAT CTCTCAGQCA GGAGAGGAAG 660 

CAAAGGGCAC AIGAATIATC AGGGCCCTGG AATGC3CAAGG AAGTTCTCTC CACCTQGGCA 720 

35 ACTCTTGCATC TCCATCACCT CGAACCTGGG TGGCTCTGCC CCCATCTCTG CAGCATCAGC 780 
TACCICrCTA GGICACTTCA CCAACTCTAT GTGCCCCCCA CAGCAGTATG GCTTTCCAQC 
TACCCCAITT GGCGCTCAAT GGAGTGGGAC GGGTGGOCCA GCACCACAGC CACTTGGCCA 

CTTCCAACCT GTOGGAACTC CCTCCTTGCA GAATTTCAAC ATCAGCAATT TGCAGAAATC 960 

CATCAGCAAC CCCCCAGGCT CCAACCTGCG GACCACTTAG ACCTAGAGAC ATTAACTGAA 1020 

45 TAGATCTOCSG GGCAGGAGAT GGAATGCTGA OGGGGTGGCT GGGGGTGGGA ACTAGCCEAT 1080 

ATACTAACTA CTAOTGCIGC ATTTAACTG6 TTATTTCrrG CCAGAGQGGA ATGirTTEAA 1140 

TACTCCA3TC ftOCCCTCAGA ATCGAGAGTC TCCCCCGCTC CAGTTATTGG AATQGGAGAG 1200 

GAAGGAAAGA ACAGCITITT TCTCAAGGGG CAGCTTCAGA CCATGCTTTC CrGTTTATCT 1260 

ATACTCACTA ATOAGGATCA GGGCTAGGAA ACJrcTTGTTC ATAAGGAAGC TGGAGAACTC 1320 

55 AATGTAAAAT CAAACCCATC TOTAATTTCG AGTOGGTGGA GCTCTTGCTT TrGGTACATG 1380 

CXXnXSAATCC CTCACTCCCT CAAGAATCCG AACCACAGGA CAAAAACCAC CTACIGGGCT 1440 

cierccTAcc croccciccr cccttttttt tacccctctc TTTTTTArrr tttctttgct 1500
CTTTAGAACC CAC7IGAAAAA TACCAGGGTA CTGGGGTC5CA ACTCTTTCIT ATGATAGGTC 1560
ATTACmSCTT TAAGCAAAAG ATATTAGCAG OTTGACTGC AGCATTAGCA ATTAGGRAAA 1620
AAAAAAANWA AAAACTCGAG GGC3GGGCCCG GTTACCCAAT TCGCCCT 1667

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1408 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
20 ATTACACACC TXSAGCACTCT GCCTC3GCAAG ACCTGTCTTA ATAGATTAGA GAACCACTGA 
TAGATCGTCA GCOTCTOrA GCAGTCAGAA CCCTACATTT CAAATGTQGA TAGCACCITT 
GCGGGGAAAC ATCACTTQGC ACATCTC3CAT TCTTTnTGA CACAGGGTCT CACTCICTTG 
CCCAGGCTAG AC7IX3CATCGC ACGATCTTAG CrCACTCCAA CCTCCACCTC CCAAGTTCAA 
GCGATICrrC TCCCTCAGCC TCCTGAGCAG CTGGGftTCAC AGACATGCGC TACCATGCCC 
AGCTAATTTT TTCTATrnT TGTKPGTTTG TnTrcnTTK TAAGTAGAGA OQCSGCTTTCA 
CCACC?rTCGS CAGGCAGC?rC TCGAACTCCT GMKnCAGC?! GATCCACCCA CATCTGCGTT 
CCAATATCTT TCTCAACATA ATCATAGCCG TAATTAATAT TTTCCAGTAC ATTTTTATGC 480 
CrTTACACAC GAGACTGGTA GACAGACACA AACCCAGATC TGTCTGACTC CAAAGCCCC3T 
TICTCATCAT TCCrTTTACG GTATCCTATA GTOGTATCCT TTACAGAAAG ACAGCTTITA 
CCCAACAAAG ACTTAACTTC CCAGGATGCC AGAftGGACAA W3CGGGATTG CTTTTAAGRA 
GRAACTTATC AAGRMCITAT TTTATAAATG AGATTAGATA GGGAAAGGCA ATTTATCTTT 720 
ATTAAAAACT GAAAAGGCCA GCATAGOGAA GGAGGTCCTT COGTGCTrCTT TTTCAGGGAA 
ATACTTCACT TCCTTTTATr AGAAACAGAT ACTACCTAAG GrTTTTGAGGT AGGWACAGCT 
TAAGGCATGC TAATXaCTCAT GGGTCCTTCC ATAGTCATTT TEOSTATTTTG GnWACATTT 
50 GAGCAATAGG CAGCCCTTCA CTGCTGCTGG AYTCATTCCT GCCAYTATTA CAGGTGACAG 
AGGAGACAGG AGC?rATOICT TITCTATrrT TAWACATGCT rTATATTTAA CACAAGCTCT 
TOGCTATCTT AGATAAACAG AAOTTCCCTA GCACTCCTrr TAGTCCATIG AACCCTTTAA 1080 
CATTTAAGCA AAATAATAAA CAtTTCTTTTG AGGnCCTTA ACAATGAAAC GTGTTCGAGT 1140 
GGCAGCAGCG GAATCCATCC nCTTCTCCT GGAGrrGTCCA AKAGTCCGTG GTCCPGACTA 1200 
60 TCTCACACAG ATOIGGCATT TrATGTGriGA TGCTCTAATT AAGGCCATTG GTACAGAACC 1260 
540
600 
660 



780 
840 
900 
AG;CTCAG.^-C GTCCTCtrC^aS ^.-ATAATGCA TTCTTTTCCA AAGGTGAATA tTIT]XrrcrT 1320
-JkAAAATATG TAT^J^GGTCG T.-rGmCATT TATTAGTCTT GCTAAAAAAA AAAAAAAAAA 1380 






(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2031 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

AGGATATCCA TSAITCITAA CCyGGCTATA TGTTAAAAAA AAATTGGAAA ATGCAATACA " 60 

TTnTTACTA TACAAJ^XTTAC AGAATCAGTA TGCAAGTriT ATTTATCAAA ATGTAATGGA 120 

25 TmTAAAGG CTSaGAAATT TTCCITATAC CTACCTTTTC AGTrATTTTA ATTATACCAA 180 

ATTATCAACT AGAATAGCTT C-JCCATATG AAATATAAAA TGAftGAGACA CCTAGGCTCT 240 

ATCAGGC:^ GGATTCmG AiCTTATrTC CACTTTAATT TCTCAGTGGA ACTTTAAGftGG 300 

30 

GCHGAGAAA-A CAAAGA;^ G^J^i^jy^CT^ CAACTAACAA AACCAGCACC ACATOSCTAG 360 

OTXnXSC-TA CTAAOTACCT TCTCAGGATT TTCCTCAGAT TGAAAAGCIT ATGAGGATTT 420 

35 CTTGGGA^ rCAATAJiCCT GCCTCrrAGT ACAGAGCTTT CCTGATGATA TTTACTCTTG 480 

AGCACA-nyiG CTICTAAAAC CTTAACTTrC TTTCTCCAGG AGGGTGGTGA TAGAAACAGA 540 

TOTTACniArr TAIGAACTCA TCrrcrcCTG AAATCrnGAG GGTQGGGAGA AAJ^^ACTTTA 600 

40 * 

agggaggaga gccatctatt ticttcxtaa agccacctct cagcagaatc GTCATGrrrr 

TCIGATCCAC CGCTCIGCIT CATGCCCAAG ATGACTTGCG AGGCAATCTC AGGAGCTGTG 
45 GACTTAAJCCR TTCCAA;tGCA CACTOrCTIT CrCAGCGTTC TCTGCAAGTC AGTAGGrrGTT 
AGTAIGGrrrG CAAACTICAC TOTCTCAGCA AAGTTGAACT GGGCTACCTC TCTACAQCTG 
TITCCrCAGA GGGAAAAATC TTGAGACCAG ATGGTGGAGC TCTGGAGTCA GflQGAAATGG 

CJIOTCTTCAG CACAAAGCTG CIGCITITAC TTCAGCCACT TCTGACATTT TTACATACCG 960 

AGCCKSGAT TRTGIGATTA TCPCAAATCA AATCACmG ATGGAGATAA ATAATCAAAA 1020 

55 cromTATA CTiCAnGATr togtoagaac agtaatogaa AATGcrrGrrG aaggacttct io80 

CA-rmTCGA GCmCCITC CftGAGTCCTG GCTGATTGGT GrrrCGCTGTT CATCPGAGCC 1140 

CCCAAAAGCA TTATTACTCA TACTTGCACA CAGTCAAAAG CGCAGACTGG ATOGATGGTC 1200 

60 
780
TTTTATAAGG CATTTAAGGG TACACTACTG TGTTTCACTG ACCATACA1T TTTCTTAGCC 1260

CCTCAA£?rAA TATAGCACAG AGTTATGAAT GACAATTCCC CTAACCATTC CTCTTCATAT 1320 

CTCCCTCrrC CCCTTACCAT CGTAATTCTC CAAACTGGTC ATAAAGGCAC TCICTGAAGA 1380 

TATTOGGGAC TCACATCTTA AGCTCTCACC TGGCTC3CAGT AGGAAAGGCC AJ^ACTGACGA 1440 
CAAAAAAAAA ATTCTrrATA AAGATGATAT GC3TAACATGT ATdTTGCCC TGGGTCTGC3G - 1500 

TGGCTCCAGT CACTTCTCAGA TTTACAAGCA TTTAGGAGCC TAGGTAAAAG CTGCTACTrAT 1560 

TCmTAAAA CTTACArrTA TCACTTGCAA TGATAGAAAA CTCCTTCCAA TTAAATGGCA 1620 

15 rrTTATAATA TTATOTCTOT ACTTCACAGT GTTAAAAATA CTCTCATACG TTATTGCATT 1680 

TCAienCAC AGAAACTGCA TTTTAACCAG TACTCTGGGT GCAATAAATA ATATGTAGAA 1740 

ATTTAAOTCX: TCCAATTCCA GCATATCCAG TGAGTTTTGA CAGTCTGnTT ATGTGGAATG 1800 

20 

•nTAAGGATA TACAATTGTA CTTTATATAA ATrGGTTCTT GTITCTTCTrA AATGTCACAT 1860 

GAAATAATTG TGCTGCTACA TTATACTGGA AATTAACAGG GGAAAAOSGA AGAGCTCTTG 1920 
25 GCTCCCITOA GGTrCTGCTA GrGCTTGrTAG GAGTOGTTAC AACIGAGCTT TTAGTAACCA 1980 
TTTAACCOTA TGTAAACTTG GriTTCTAATT AAAAAAAAAT TTCrmTCC A 2031 



(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 971 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGOCTCGGAA CTCGGCOCXXS GGACATCCAC GGGGCGOGAG TGACACGCGG GAGGGftGAGC 60 

ACyTCTTCIGC TGGAGCCGAT GCCAAAAACC ATGCATTTCT TATTCAGATT CATTGTTTTC 120 

TTTIATCrcT GQGQCCTTTT TACTGCTCAG AGACAAAAGA AAGAGGAGAG CACOGAAGAA 180 

GIGAAAATAG AACTTITCCA HXTTCCAGAA AACTGCTCTA AGACAAGCAA GAAGGGAGAC 240 

50 CTACTAAATG CCCATTAaXSA CGGCTACCTG QCTAAAGACG GCTCGAAATT CTACTGCAGC 300 

CGGACACAAA ATCAAGGCCA CCCCAAATGG TTTGTTCrTG GTGTrOQGCA AGTCATAAAA 360 

QGCCTAGACA TTCCTA3X3AC AGATATCJTGC CCTGGAGAAA AGCGAAAAGT AGrTTATACCC 420 

CCTTCATTTG CATACGGAAA GGAAGGCTAT GCAGAAGGCA AGATTCCACC GGATGCTACA 480 

TTCATTTrrG AGATTCAACT TTASOTCTG ACCAAAGGAC CACGGAGCAT TGAGACATTT 540 

60 AAACAAATAG ACATOGACAA TGACAGGCAG CTCTCTAAAG CCGAGATAAA CCTCTACTTG 600 
CAAAGGGAAT TTGAAAAAGA TGAGAAGCXA CGTGACAAGT CATATCAGGA TGCAGTTTTA 660

GAAGATATTT TTAAGAAGAA TGACCATGAT GGTGATGGCT TCATTTCTCC CAAQGAATAC 720

AATGTATACC AACACGATGA ACTATAGCAT ATTTGTATTT CTACTTTTTT TTTTTAGCTA 780

TTTACTGTAC TTTATGTATA AAACTW^GTC ACTTTTCTCC AAGTTGTATT TGCTATTTTT 840

CCCCTAriGAG AAGATATTTT GATCTCCCCA ATACATTGAT TTTQGTATAA TAAATGrPGAG 900

GCTGTTTTGC AAACTTAAAA AAAAAWWAAA AAAACTSGAG GGGQGCXXX?r ACCCAANTCX3 960

CCOJATATGA T 971



(2) INFORMATION FOR SEQ ID NO: 34: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1792 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

GAACCCCCTT TCTCCTGGTA AAQGffEAAGG GGGGGGATAA TGTITACCAC AGGTACGAAA 60 

TAGTCACTTT AACATTGAGA CCTCTGCCTC ATTGAATTCA GGTTmTAA GTACTTGAAA 120 

CTCTTCflGAT TCTCCTTATT TTAGTTTCTT TTTACATTTA TGAAGTAGAA AGCATTGrTT 180 

35 TOTAAACror TTTCAAAATA AATAGCCTAG TCTCTTATCC TCTTTAGCGT GGATTAAAGG 240 

TGAAOTTCTC CAAATOGGAG AGTGTTCACA GTAGATAGCT CAGATTGATT GAACACATTT 300 

GftGGAAGAGA CTCCTOCATG AGATACCAGC ATTTTTACAA ATACTTTTTA TGTACATTCr 360 

TTATTTIGTC ATnTCTCAA CCCTCTCCCC AAGCACATCT TCTTTCCnT TACTATGTCT 420 

ATOTAGGGAA AAACAAAACA AAAAATTGCA CTTACGTEAC ACTCCCAAAA TGrTGGCyrAAT 480 

45 CCGrGTCTTT CAAAAAACAT TTCTGrnTTT TGrriTTGTTT TGGTCAGTCC ATTGCATAAG 540 

1GACAA£?rTT GGGTGCTTCT GGCACGTATG TATGAAGCGG GAGGGGGATG ASAATTGCCT 600 

CJICCITCACT ARGCTCTAAA AGTAATTTAC ATGTAAGTAA AAAGGGAAAA TAGAATAGAT 660 

GCCAAAGICA TTTATTCAGT CCTTAGTTTT CTTATGTGGC ATTACTGCAT CTGCTAGrrTA 720 

GTGAGAAAGC ACCCTCAGCT TTTACTGCTC CCCTCCCTGC CIGCCAACAC ACTTGATGTG 780 

55 TGCAAACAGC CCTCAAGTAT CTGTCAGATG ACCTATATAA GGTATTGAAT AAGCJrATTCT 840 
TOTCACTITA GAAATOGACT GGATAAAACT TACTTGGrrrG TCATTATnT ATCTCATTTG 900



TCCTOTTACA TCCCCTAIGT TAAGATAATT ATATTGCCAC TAATAATCAA GATGCTAAAT 960 



GACTATTACA ACTGGCTAAT ATCATTTTTr ATATACAAGG GTATGTGrTAT ATTTC3GAATT 
GRTATCAGAA ACTCATTTGT ACCCATTTGA GrGATATTGC ACAACAAACA CAGATAYCTA 
CAGACTCCCT TTTCATTTTC TOGTGTTCTT TATGATAATG ATCTrTGTAG ATTGGTTATT 
TCTCTACTTT ATCTCTAATA AACTTTGTAG ATCCTGTGAA CXMTACTIT GCCTAAATCA 
CTPGAGACTT GAGTCTTTAA TAACAAAGCA TCAATATTCA CTAAAGTCAA TCTCITITGA 
CTrrrCTOTGA CTTQGCTAGA AGCTCTTGAC ACTAAGGGAT TAGrrcTTAAT TTTCCCTGGG 
GGTOTTCCAC TAGGGCATTA CTGTATAATG ACTTGATGTT GCXZACATAGA CTTCAAGATA 
TATAATATTT TGAGGATTTT GTTGATTGGC CTATGnTTTA TTGCATAGTG TGAAACGTGT 
AAAGCTIGGT TAACCTGTAT ATAGATAGCT TATTGTTGAC TAGTTATAGT GTATTTAGGG 
TIGCCICTAA TATTTAAGCT TCTTTACTGA TGTGTGTGCT GGTAGGAACA TATAATTTTT 
GTACATTATA TTTACTGAGA TCTTGCCTTT TTTATTTTAC AAATACTTTG GAATTCCAAT 

CTcyrrrriTG cTTCCGnxsAG gattaatttg gaaaggtttt taatgacatt ccacpgattt 

CAGAimGC TTGAGATTCA CTTCAATAAA TTGTCCTGrrA TGTTCCAAAA AAAAATTAAA 

aaactcgagg ggggcccggt acccaanncg ccggatatga tcgtaaacaa tc 
(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 896 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
AGTTCNANAC AACAGGACCT GfiGTCCTTGG GCAGCACCAG TAGGTTGCCC CYTGCYTCYT 
GCCAGCYTCA CYTCCCACYT TYTGCCCCTY TCGGGATGCC TTCGCAGACA GAGYTYTTCG 
CTGCCKJIGG TGGCCAXTCT TTGCTTTTGG TTYTCTTGCC CCTTGGCCTC CCTmTOTC 
CCCGGGCAGC CTTCTGTGAC CTGCCCTTTT CCCTCCCTTC dTTCCAGGA CAAGCACGCC 
GAGGAGCTGC GGAAAAACAA GGAGCTGAAG GAAGAGOCCT OCAGGTAAAG CCTAGAGGCC 
AAAGAACTTT CCAGGTCAGC CQGACAGCTC CAGCAGCTCC ACGTTCCAGG CAGCCTaaC 
CGCCGGCTGC GCTCCCAGCA CTGGGGnTTG GGGGGAGGGG GGTOGCCAAG GGGCGTTTCC 
TCroCTTTTG GrTOTTTGrCAC ATGTTAAGAA TPGACCftSTG AAGCCATCCT ATTTGTTTCC 
GQGGAACAAT GACGGGGTTGG GARAGGGGAG AGGAGAGAGT TTGGGAAAGG GAGATGGAGA 
AGAACrCAAG GACATTGCAA CCCTGCCCGG CGCAGATCTG ATTTTCACAT CTCTACCTGG 
ACATIGAGCC TCCCAGGCAC CATGITCRGG AGftGATGAAA ACCAGGGCGG TAGAACTTCA 660

GGOTGAAGGA CAGAGTCCTG GGTGGGGCAG CSGCK3CAGG GCGCACCAGA GAACCCAGCC 720 

5 

AGAGGGGOTG TGACTTACCAG TQGTGTTGCT TCCACCCTGC AGCAGGrTGGG ATGAGGTCTG 780 

TCKTKTKnt; TGAACCATCA TrTTTTGATC ATCATGACCA ATGAAACATT GAAAAAAAAA 840 

10 AAAAAAACTG GAGGGGGGCC CGTACCCAAN TCGCCGNATA GTCATCGTAA ACAATC 896 



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 912 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
25 TCGACCCACG CGTCCGCTTCA GCCAGTCGCA TCCAGCCATG ACAGCCTTCT GCTCCCTGCT 60 
CCTCCAAGCG CAGAGCCTCC TACCCAGGAC CATGGCAGCC CCCCAGGACA GCCTCAGACC 120 
AGQGGAGGAA GACGAAGGGA TGCAGCTGCT ACAGACAAAG GACTCCATOG CCAAGGGAGC 180 
TAQGCCCGGG GCCAKCCGCG GCAGGGCTCG CTGGGGTCTG GCCTACACQC TGCTGCACAA 
CCCAACCCTC CAGGTCTTCC GCAAGAOGGC 0CTC3TTGGGT GCCAATOGTC CCCAGCCCTG 
35 ARGGCAGGGA AKCTTCAACCC ACCTC5CCCAT CTGTCCTGAG C3CATGTTCCT C3CCTACCATC 
GGATCACYGT GCTTKGGTGG AGGTCTGTCT GCACTGGGAG CCTCARGARG GCTCTGCTCC



240 
300 
360 



CtCCTCCCTC CCCGGCTCTC CTCCCAGCAT CACACCAGCC ATGCAGCCAG CAQGrTCCTCC 420 



480 



ACCCACTTOG CTATOGGAGA GCCAGCAGQG GTrCTOGftGA AAAAAACTQG TGQGTTAGGG 540 



600 



CCTTGCSTCCA GGAGCCAGTT GftGCCAGQGC AGCCACATCC AGGCGTCTCC CTACCCTQGC 

45 TCIGCCATCA GCCTTCAAGG GCCPCGATGA AGCCTTCTCT GGAACCACTC CAGCCCAQCT 660 

CCACCTCAGC CTrOGCCTTC ACGCTGTGGA AGCAGCCAAG GCACTTCCTC ACCCCYTCAG 720 

CGCCflCGGAC CTYTYTOGGG AGrTCGCCGGA AAGCTCCCSG GCCTYTGGCC TGCAGGQCAG 780 

CCCAACTCAT GACTCAGACC AGGTCCCACA CTGAGCTGCC CACACTCGAG AGCCAGATAT 840 

TITICTACTr rrTATKCCTT TOGCTATTAT GAAAGAGGTT AGTC7PGTTCC CTGCAATAAA 900 
CTTOTTCCTG AG 912



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1382 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

10 AKmXSGCAC GAGCGGAGGC GAGGGAAACT RAGQCXX5AAA GrrTGrTGTGTC GTGTTOGCftG 60 

GAGGGCCTAG AAGGGAAAGA CTOTCTAGTG QGACAATCrrC ATATTATAAA TITCGAATGC 120 

-TCAATAGAAA ATTATAGATT TTCATATTGA AGGAAATGAA QCGAAGCYTA AATGAAAATT 180 

CAGCTCGAAG TACAGCAGGC TCTITGCCTG TTCCGTTGTT CAATCAGAAA AAGAGGAACA 240 

GACAGCCATT AACTTCTAAT CCACTTAAAG A3X5ATTCAGG TATCACTACC CCTTCTGACA 300 

20 ATTATCAITT TCCTCCTCTA CCTACAGATT GGGCCTGGGA AGCTGTGAAT CCAGAGTTKG 360 

CTCCTOTAAT GAAAACWJTC GACACCGGGC AAATACCACA TTCAGTTTCT CGTCCTCTGA 420 

GAAGTCAAGA TTCTCTCTIT AACTCTATTC AATCAAATAC TGGAAGAAGC CAGQGTGGrrT 480 

GGAGCTACAG AGATGGTAAC AAAAATACCA GCTTGAAAAC TTGQy^AAA AATGATTTTA 540 

600 



15 



25 



35 



45 



AQCCTCAATG TAAACGAACA AACTTAGTGG CAAATGATQG AAAAAATTCT TGTCCAATGA 

30 C?rrCGGGAGC TCAACAACAA AAACAATTAA GAACACCTGA ACCTCCTAAC TTATCTCGCA 660 

ACAAAGAAAC CGAGCTACTC AGACAAACAC ATTCATCAAA AATATCTQGC TGCACAATGA 720 

GAGGGCTAGA CAAAAACACTT GCACTACAGA CACTTAAGCC CAATTTTCAA CAAAATCAAT 780 

ATAAGANACA AATCTTGGAT GATATTCCAG AAGACAACAC CCTGAAGGAA ACCTCATrGT 840 

ATCAGITACA CTTTAAGGAA AAAGCrAGTT CTTTAAGAAT TATTTCTGCA GTTATTGAAA 900 

40 GCATGAAOTA TIGGCGTGAA CATGCACAGA AAACTGTACT TCTTrTTGAA GTATTAGCTG 960 

TTCTTCAITC AGCTCTTACA CCTGGCCCAT ATTATTCGAA GACTTTTdT ATGflGQGATG 1020 

GGAAAAATAC TCTGCCTICT GTCTnTATG AAATCGATCG TGAACTTCCG AGACTGATTA 1080 

GAGGCCGACT TCATAGATCT GTTGGCAACT ATGACCAGAA AAAGAACATT TTCCAATGTG 1140 

TTTCTOICAG ACCQGCGTCT GrTTTCTGAGC AAAAAACTTT CCAQGCATTT GTCAAAATTG 1200 

50 CAGATOITGA GATCCACTTAT TATATTAATG TGATGAATGA AACTTAAGTA GTGATAAAAG 1260 

GAACTTTAGC ATAAATTATA GCAGTTTTCT GTTATTGCTT AATTTACCAT CTCCATACyTT 1320 

TTATAGCTAC TATICTATTT CACrTGTTGA ATTAAAGTAT TTGAATTCTr TTAAAAAAAA 1380 
AA 
(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 872 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

GGGCTACTTC AAAGCCCTGG GCCTTATTTC TTCAQGTAAA AAAATATAAA GTCAGATCTC 60 

ATCCCOQCTG GCCATGCTCT TAGACCCTTT CATCCTTCTC TTCTGCCTCT TCTCAACAGC 120 

15 TQCCCACTCC TGTTTGGAAT TCATATACAT ACAGTTCTAA TACTGATGTA TTTACCCTCA 180 

TAAGCCACTC AACCCAGAAT CrTATTTGAA TTATAATCCA GAAACATCAG GTGACGTGTG 240 

AGACTACTGT ATCAGAAAGA GACAGTTTAA GGGrTCAGTCC AATGGAAAAA AGAGTTCTCA 300 

GAGCTTTCTT TAGCTTATTC TCATCAAAiSA GCTTTCTCTG CAGAAGGAAC CTACTGC?nC 360 

CrCCTTTCCA GTCCTAGAAA TCCTGACCTA GAGTGGCTTA ATCCTGCTAG CACCTCTCTC 420 

25 TCGCACTCTG GTOCCAAATC ACTCCAQGAA CTGQCSCCATG ATGTGGTGGG AATGACCTTA 480 

CCCTGAGCAT GTCACTCATG CATTGAACAA CAGCTAAGAG CAGAGCTTAG AGCTTAGAGC 540 

TOGGCCCTCT AAGCTGAGAG GAATCACATC CTGCAGAAGT CTGTCCTGAG AAGCAGGTAC 600 

TCCTOICACA GCAGAGACAC ACTGGATACC TGAGTAACAA TAATACAAGA CAGGACGTGG 660 

GMACAGCAAA AGATTTOQGT GTCAGAAGAR GCCGAGAACA CTTYCAGGCA GGAACATTCA 720 

35 RAKrrorrcr tqgaggaart aggcmcsaag gctgggcagg atttcmoggg gcagagatgg 780 

AGCAAGCAAT TGAAATGAAA GCCATGGCAT GGGAAAAGGA GCACTGGCCA CAGGGAGIGC 



20 
AACGTTGTGA TGCAAGGCCA CTGTGGAGCC AT 872
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 812 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
GGCAGAGGCT CACCCCAGCA GAGATTGAGG GGGAACOGTG ATGAAATTTT TAAGTATTCT 60 
GCTTGA1GAT AATAATTTTY CTCTTATGTT AATGTTGGCT CCGTTTGGGT GTTTAGCTTr 120 
TCAAAGGAOT ATGAAAATQC GGAATGGOGC TTTQGGGCTT GAGGAGGTGT GATCTCTACT 



180 



60 CTTTAAAAAA TTTAATTGCA CAAATAGAAA TAATTCACCC ACATTATTGA ACCCCACTAA 240 






40 



50 



CCTGTCAAAG TCAGAaTOCCA AAAAAGCCGC CTCAAAGACG CTGCTGGAGA AGAGrrCflGTT 
TTCAGATAAG CCGGTGCAAG ACCGGGGrTTT GGrTGGTGACG GACCTCAAAG CTGAGAGTGT 
GOTTCTTCAG CATCGCAGCT ACTQCTCQGC AAAGGCCCGG GACAGACACT TTGCTGCSCSGA 



55 GAGCAAffTTC ACACAGATCT CACCCGTCTG GCTQCAGCTG AAGftGAOSTG GCCGTCAGAT 



60 



TCCCAAGGGC CTCCACATAG TGCCTCQGCT CCTOITTGAG GACTOGACTT ACGATGATTT 



AGCATATCCT TTTTCTCCAT ATTCCnTCC TGCTGCXTTC GTOTGTACXZA TTATTACTCA 300 

CTICTGAnr GflGCTCCyrTC CACTTAAAGT CATTCATAGA TACTTTTC5CG TCGTGTTKGA 360 

ATATTTATTC AATITCTATr CTGTC3TTTTA CTTAATTACT TTATTATOGA ACCTTTACAC 420

AGCTCTOGTC TACrrcnTCT TTGAAAAGTC TTATGTTGAC CACCATCACT GAGCATATAG 480 

10 CTTTPrcCTT ATrrcCTTGG GATAATTACC CX3AAGTGGAA ATACCGAATC AAACTTCPGT 540 

'ITlCTrrCTT TGGCACTATT ATATAAATTG TTTTCCAAAC AAGGCATGTT TACAATAGAC 600 

MTTTTCAAA ATCTOGGTAT TTGTCCTATT TTGCTCTCTG TATGCAGAAT TCAGCGGGGT 660 

GCCAACnCGT 'ITiVI'GTCTG GGTTGAGAGA CAGGCTGTGC AGCCCACTCT TGCATAQGAC 720 

TAACTACTAC AAATCATGCT GAGACCXSAGC TATTTTTGCT GCTTAGARGC nTGCAGCCT 780 

20 TGAGrTAAGTP TCCaiCATCTG GAAACNTTCai AA 812 



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1515 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
35 AATTCGGCAC GAGGGAAATT CAAGCACTTT TCCTAAAAGA AGGGGGAATG GATGCTGAAA 60 
CAACACGIWr CCCACAAAGG GAGCAGACAC TGGGCTTGTG AAGCTGCCCC ATACCTTCCC 120 
CACAGAACTG GGGTCCGGCC TCCCTGACAT GCAGATTTCC ACCCAGAAGA CAGAGAAGGA 



180 



GCCAGTGCTC ATOGAATOGG CTGGGGTCAA AGACTGGGTG CCTGOGAGCT GAGGCAGCCA 240 
CCGTITCAGC CTOGCCAGCC CTCTGGACCC CGAGGTTGGA CCCTACTGTG ACACACCTAC 



300 



45 CATGCGGACA CTCTTCAACC TCCTCTGGCT TGCCCTQGCC TGCAGCCCTG TTCACACTAC 360 



420 
480 
540 



TCTACRSGGC TATOTCACTC CATGGAACAG CCATGGCTAC GATGrTCACCA AGGTCTTTGG 600 



660 



GTTTGAGCyrC ACGQGCCTCC ACGACGTGGA OCAAGGGTGG ATGCGAGCTG TCAGGAAGCA 720 



780 
CCGGAACCyrC TTAGACAGTG AGGATGAGAT AGAGGAGCTG AGCAAGACCG TGGTCCflGGT 840

GGCAAAGAAC CAGCATTTCG ATGGCTPCGT GGTGGAGGTC TGGAACCAGC TGCTAAGCCA 900 

GAAGCGCGTC ACCGACCAGC TGGGCATGTr CACGCACAAG GAGTTTGAGC AGCTGGCCCC 960 

CGTGCTCGAT GGrmCAGCC TCATGACCTA CGACTACTCT ACAGCGCATC AGCCTGGCCC 1020 
TAATGCACCC CTCTCCTCGG TTCGAGCCTG CC3TCCAQGTC CTGGACXXX3A AGTCCAAGTG 1080

GCGAAGCAAA ATCCTCCTGG GGCTCAACTT CTATGGTATG GACTACGCGA CCTCCAAGGA 1140 

TGCCCXTTCAG CCTCTTGTCG GGGCCAGGTA CATCCAGACA CTGAAGGACC ACAGGCCCOG 1200 

GATGGTGrrGG GACAGCX^^ YCTCAGAGCA CTTCTTCGAG TACAAGAAGA GCXXXMTQG 1260 

GAGGCAOGrrC GTCTTCTACC CAACCCTGAA GTCCCTGCAG GTGCGGCTGG AGCTGGCCCG 1320 

QGAGCTCGGC GTTGGGGTCT CTATCTGGGA GCTGGGCCAG GGCCTGGACT ACTTCTACGA 1380 

CCTGCTCTAG GTCQGCATTG CGGCCTCCGC GGTGGACGTG TTCTTTTCTA AGCCATGGAG 1440 

TGAGTCAGCA GCTGrTGAAAT ACAGGCCTTC ACTCCGTTAA AAAAAAAAAA AAAAAAAAAA 1500 

AAAAAAAAAA AAAAA 1515



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 704 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

40 AAGATGGTGG CGCCCAGAGC TTCGCTCTAT GCTQCTCCCC TGAGAGAGGC GTTTCCflTCA 60 

ACCAC?rTTTG CAAGGAGTTC AATGAGAGGA CAAAGGACAT CAAGGAAGGC ATTCCTCTGC 120 

CTACCAAGAT TTTAGTGAAG CCTGACAGGA CATTTGAAAT TAAGATTGGA CAGCCCACTG 180 

45 

TTTCCTACrT CCTGAAQGCA GCAGCTGGGA TTGAAAAGQG GGCCCGGCAA ACAGGGAAAG 240 

AGGTGGCAGG CCTGGTGACC TTGAAGCATG TGTATGAGAT TGCCCGCATC AAAGCTCAGG 300 

50 ATCAGGCATT TGCCCTGCAG GATGTACCCC TGTCGTCTGT TGTCCGCTCC ATCATCQGGT 360 

CTGCCCGrnC TCTGGGCATT CGCGTGGTGA AGGACCTCAG TTCAGAAGAG CTTGCAGCTT 420 

TCCAGAAGGA ACGAGCCATC TTCCTGGCTG CTCAGAAGGA GGCAGATTTG GCTGCCCAAG 480 

55 

AAGAAGCTGC CAAGAAGTGA CCCTTGCCCC ACCAACTCCC AGATTTCAAA GGAGGTAGTT 540 

GCAAAAGCTG TCCCCAAGGG GAGGAAGGAG GTCACACCAA TATGATGATG GTTTTCATGA 600 

60 CTTTGAA'TCA TATATTTTTG TACATCTAGC TGTATCGAGG CATCAGGCCT GAATAAACAT 660 
CXTTTTCTTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAA
(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1094 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

GGCAGCTTTC TTACAAACGC ATCCTTCTGA AATGITGCTT CAAATTCATC CTCTGCTCCC 60 

CAGTCCCACr ATTCCACACA TACTGTTACT GTTTCTTTAT CCTACTTTCT CAATTTTGGA 120 

ACATAGTTGC AGTTACTGCA TTGAATACCT GTQGGTTTGC CTGTTGTTCT GTCTGTCTCT 180 

GTGGTrCTTG TAATANTC3GA TCCCAGAGAT AAAATGGflCA GTTGINATGC ACAGTTAATT 240 

CAGAAACTAG ACCTTACTTG CTGTGTGAAA TACCAACTAA ATTCTCAGTG AACTCAQCTG 300 

ANCTTTATCT CCTTTTGTIT CCCCAATTTA TAATTTCAGT TCAGGCCCftb AAAGATGGAA 360 

TOCCAGCTAA GAAATACAAG TTACACCCTG TACTAGCAGC CCATGTGTGC ATGTTCTTTA 420 

AGTGCTCTTG CAGCTATGTC ATTrATATTG ArrTCCCTGT ATTATTATAA GCAAAGCAAA 480 

TTTCAGGAAA AAAACCCATA ATACCACACC TCATTTTTTT CAAGTAATAG GGTCATAAGT 540 

CTCATYCTYC ATATAATATG TTGAGTATGC AGTATATTAT GTGTTAGGCT CTGGANAGGC 600 

AGAOGTTAGA TCATGIWACA GATQ^TATCK GATTAGGCAG ATAAACAGTA TTTTAACCTT 660 

TTCCTTATTA TATGTAACTT GCTTTCAGGT TTTTTAATGT TACTATTATG TCTTTAATAT 720 

ATTATCTTTA TTTGTACTTT TGTATACAGA GTGATTTTCC TTTnTAAAA AAAATTGTGT 780 

CTTTAGGATG GATTCCAAAG ATGTOGAATC AGTAQGrTTTA AGGAATATQG ATATnTGGC 840 

TGGCAAGGTG GCTCACACCT GTAATCCCAG CACTTTGGGA GGCTGAGGTG GGTGGATCAC 900 

CTGAAGTCAG GAGTTCGAGA CCAGCCTGAC CAACATGGOG AAAOCCTGTT TNTACTAAAG 960 

ACACACWWAA AATTRGCCAG TQGrTOGTGGC A3X?rGCTTGT AGTCCCACTT AGCTACTCGA 1020 

GAGGCTGAGG CAGGAGAATC GCTTGAACCC GGGAGGCAGA QGTTGCAGTG AGGCAAGATG 1080 

GCACCTCTAC ACTC 1094 



55 



60 



(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1821 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
TGGCTT.-JSGC CATCACCCTT CCCTTGGCTG GAACTACrGG ACAGACCCTT TTGAGATGTG 60 

10 CCTGIOrGC TGTSGAGATG ICTGrTAGTGG TCTTAGCTCT TTGTrGAGCT TGTGTGTGTG 120 

rraz^porx: -Casctotat GCTGAAArrG ggcgtgtgtt ggagggcttc ttagctcttt 180 

GCjrSPC?^^^ -^A: ■,'.X,V A TG rCTPirGTATC ASCTGAATGT TGCTGGAAAT AAAACCTTGG 240 

15 

rrrGI^gAGG CieiTrTTTG TGGGAACTTAA GTAGGGGAAA AGGTCTTTGA GGGTTCCTAG 300 

GCTCCTTT^T ACAACAGGAA AATGCdCAA AGCTTTGCTT CCCAGCAACX: TGGGGCTCGT 360 

20 TCCCAGTGCC -TGCrCCTQCC CCITCCTGGT TCTTATCTCA AGGCAGAGCT TCTGAATTTC 420 

MXK:::n:Tc?-jr tccagagccx: icrrGTOGCC aggccttcct ttgctggagg aaggtacaca 480 

GGG^GAAGCT GATGCTGTAC TTGQGGGATC TCCTTGGCCT GTTCCACCAA GTGAGAGAAG 540 

25 

GTACTTACTC .VJJ A CCTCC TCTTCAQCCA QCTTGCAITAA CAGACCTCCC TACAGCTGTA 600 

GGAACTACIX; rCCCAG?J3CT G?dGGCAAGGG GAITTCTCAG GTCATTTC3GA GAACAAGTGC 660 

30 TTTASrAGTA GOTTAA^OTA GTRACTGCTA CTGTATTTAG TGGGGTGGAA TTCAGAAGAA 720 

ATTTGAAG^JC CAG.:^:rCATGG GTGGTCTGCA TOTGAATGAA CAGGAATGAG CCGGACAGCC 780 

TQGC7CTCAT T3CmTCTTC GTCCXTCATTT GGACCCTTCT CTGCCCTTAC ArrTTTGTTT 840 

CTCCATCIAC CACCATCCAC CAGTCTATTT ATTAACTTAG CAAGAGGACA AGTAAAGGGC 900 
CCrCTTGGCT TGArTTTGCT TCTTTCTTTC TGTGGAGGAT - ATACT AAGTG CGACTTTGCC 960 

40 CTATCCTATT TQG^AATCCC TAACAGAATT GAGTTTTCTA TTAAGGATCC AAAAAGAAAA 1020 

ACAAAATSCT AATGAAGCCA TCAGTCAAGG GTCACATGCC AATAAACAAT AAATTTTCCA 1080 

GAAGy^ilSA AATCTAACTA GACAAATAAA GTAGAGCTTA TGAAATGCTT CAGTAAGGAT 1140 

45 

GftGCTTOrrG mvi ' iCTi T ' ivriTmiTr TGKrmTTA aagacggagt crcGcrcrGrr 1200 

CACTCAGGCT GGAGTGCaGT GGTAT6ATCT TGGCTCACTG TAACCTCCGC CTCCCGGGTT 1260 

50 CAAGCCAITTC TCCIGCCTCA GTCTCCTGAG TAGCTGGGAT TACAGGriGCG TGCCACCATG 1320 

CCTGGCT.-AX nV-CJlVri T TTAGTAGAGA CAGGGTTTCA CCATGrTTGGT CGGGCTGGTC 1380 

TCAAACTCCT GACCTCTTGA TCCGCCTGCC TTGGCXrTCCC AAAGTGATGG GATTACAGAT 1440 

GTG.iGCCijCC CX?r^3CCCTAG CCAAGGATGA GATTTTTAAA GTATGTrTCA GTTCTGTGTC 1500 

A3X3GTTGGAA GAC^JGAGTAG GAAGGATATG GAAAAGGTCA TGGGGAAGCA GAGGTCATTC 1560 

60 ATCGCTCICT GAAITTCAGG TGAATGGTTC CTTATTGTCT AGGCCACTTG TGAAGAATAT 1620 
GACTTCAGITA TTGCCAGCCT TGGAATTTAC TTCTCTAGCT TACAATQGAC CTTTTGAACT
GGAAAACACC TTGTCTGCAT TCACTTTAAA ATGTCAAAAC TAATTTTTAT AATAAATGTT 
TATTTTCACA TTGAAAAAAA AAAAAAATTT AAAAACTCGG GGGGGGCCCS G-IAOXCATT 
NGCXrCTAAG GGGGGGGGTT T 



1680 
1740 
1800 
1821 



10 



20 



25 



30 



(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1024 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GQQGCACAGT TGAAGAAGCG ACCGAGQGAC TGGGAGrTCGT TAGTCAGGAT GACGCGQCAT 60 

GGCAAGAACT GCACCGCAGG GCCGTCTACA CCTACCACGA GAAGAAGAAG GACACAGCGG 120 

CCTCGGGCTA TGGGACCCAG AACATTOGAC TGAGCCGGGA TGCCGTGAAG GijCTTCGACT 180 

GCTGTTGTCT CTCCCTGCAG CCTTGCCACG ATCCTGTrGT CACCCCAGAT GGCTACCrGT 240 

ATCAGCGTGA GGCCATCTTG GAGTACATTC TGCACCAGAA GAAGGAGATT GCCCGGCAGA 300 

TCAAGGCCTA CGAGAAGCAG CGGGGCACCC QGCGCGAGGA GCAGAAGGAG CTTCAGCGGG 360 

35 CGGCCTCGCA GGACCATGTG CQGGGCTTCC TGGAGAAGGA GTCGGCTATC GTGAGCCGGC 420 

CCCTCAACCC TTTCACAGCC AAGOCCCTCT CGGGCACCAG CCCAGATGAT GTCCAACCTG 480 

GGCCCAGTOT GGGTCCTCCA AGTAAGGACA AGGACAAAGT GCTGCCCAGC TTCTC3GATCC 540 

40 

CGTCGCTGAC GCCCGAAGCC AAGGCCACCA AGCTGGAGAA GCCGTCCCGC ACGGTGACCT 600 

GCCCCATGTC AGGGAAGCCC CTGCGCATGT OGGACCTGAC GCCCGTGCAC TTCACAOCGC 660 

45 TAGACAGCTC CCTQGACCGC CTrGGGGCTCA TCACCCGCAG CGAGCGCTAC GrTGTGTGCOG 720 

TGACCCGCGA CAGCCTCAGC AACGCCACCC CCTQCGCTGT GCTC3CGGCCC TCTGGGGCTG 780 

TGGTCACCCT OGAATCCGTG GAGAAGCTGA TTCQGAAGGA CATGGTGGAC CCTGTGACTG 840 

50 

GAGACAAACT CACAGACCGC GACATCATCG TGCTGCAGCG GGGCGGTACC GSTTCGCGGG 900 

CTCCGGAGTC AAGCTCCAAG CGGAGAAATC ACGGCCGGTG ATGCAGGCCT GAGTGrPGTGC 960 

55 GGGAGACCAA ATAAACCGGC TTGGGTGCGC AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1020 

AAAA 1024 



60 
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 983 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

CGACACGGCT GCGAGAAGAC GACAGAAGGG CCCGACCGCG AGCCGTCCAG GTCTCAGTGC 60 

TOTGCCCCCC CCAGAGCCTA GAGGATGTTT CATGGGATCC CAGCCACGCC GGGCATAGGA 120 

15 

GCCCCTGGGA ACAAGCCGGA GCTGTATGAG GAAGTCAAGT TGTACAAGAA CGCCCGGGAG 180 

AGGGAGAAGT ACGACAACAT GGCAGAGCTG TTTGCGGTGG TGAAGACAAT GCAAGCCCTG 240 

20 GAGAAGGCCT ACATCAAGGA CTGTGTCTCC CCCAGCGAGT ACACTGCAGC CTGCTCCCGG 300 

CTCCTOC?rCC AATACAAAGC TGCCTTCAGG CAGGTCCAGG GCTCAGAAAT CAGCTCTATT 360 

GACGAATTCT GCCGCAAGTT CCGCCTQGAC TGCCCGCTGG CCATGGAGCG GATCAAGGAG 420 

25 

GACCQGCCCA TCACCATCAA GGACGACAAG GGCAACCTCA ACCGCTGCAT CGCAGAOGTPG 480 

GTCTCGCTCT TCATCACGGT CATQGACAAG CTQCGCCrGG A6ATCCGCGC CATQGATGAG 540 

30 ATCCAGCCCG ACCTGCGAGA GCTGATGGAG ACCATGCACC GCATGAGCCA CCTCCCACCC 600 

GACTITGAGG GCCGCCAGAC GGTCAGCCAG TGGCTGCAGA CCCTGAGCGG CATGTCGGCG 660 

TCAGATGAGC TQGACGACTC ACAGGTGCGT CAGATGCTGT TCGACCTGGA GTCAGCCTAC 720 

35 

AACGCCTTCA ACCGCTTCCT GCATGCCTGA GCCCGGGQCA CTAGCCCTTG GACAGAAGGG 780 

CAGAGTCTGA GGCGATGGCT CCTGGTCCCr TGTCCGCCAC ACAGGCCGTG GTCATCCACA 840 

40 CAACTCACTC TCTGCAGCTC CCTGrTCTGGT GTCTGTCrrT GGrTGTCAGAA CTTTTGGGCC 900 

GGGCCCCTCC CCACAATAAA GATGCTCTCC GACCTTCAAA AAAAAAAAAA AAAAAAAAGR 960 
KGSGGCCGGT CCCCANTCCC CCC 

45 



50 



(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2421 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CCGGCTGATC GCTGCCQCTC CGCCAATACA ATAGAGCCAK CCACTACCAG CAGCCTQGCC 



983 



60 
CTCTTCCTCC TTCTCCAGAG AGACCAATCC AGCCGAACTC GGGGTTTGCC TX3AGGAGAAG
GAGGAAGTCA CCATGGACAC AAGTGAAAAC AGACCTGAAA ATGATGTTCC AGAACCTCCC 
ATGCCTATTG CAGACCAAGT CAGCAATGAT GACOSCCCGG AGGGCAGTCT TGAAGATGAG 
GAGAAGAAAG AGAGCTCGCT GCCCAAATCA TTCAAGAGGA AGATCTCCC3T TGTCTCAGCT 
ACCAAGGGGG TGCCAGCTGG AAACAGTGAC ACAGAGGGGG GCCAGCCTGG TCGGAAACGA 
CGCTCGGGAG CCAGCACAGC CACCACACAG AAGAAACCTT CCATCAGTAT CACCACTGAA 
TCACTAAAGA GCCTCATCCC CGACATCAAA CCCCTGGCGG GGCAGGRGGC TCnTGTGGAT 
CrrCATGCTG ATGACTCTCG CATCTCTGAG GATGAGACAG AGCCJrAATQG CGATGATGGG 
ACCCATGACA AGGGGCTCAA AATATCCCGG ACAGTCACTC AGGTAGrTACC TGCAGAGGGC 
CAGGAGAATC GGCAGAGGGA AGAAGAGGAA GAAGAGAAGG AACCK5AAGC AGAACCTCCT 
OTACCrcCCC AQGTGTCAGT AGAGGTOGCC TTGCCCCCAC CTGCAGAGCA TGAAGTAAAG 
AAAGTGACPT TAGGAGATAC CrTAACTCGA CGTTCCATTA GCCAGCAGAA GTCCGGAGTT 
TCCATTACCA TTGA3XSACCC AGTCCGAACT GCCCAGGTGC CCTCCCCACC CCQQGGCAAG 
ATTAGCAACA TTGTCCATAT CTCCAATTTG GTCCGTCCTT TCACTTTAGG CCAGCTAAAG 
GAGTTCTTGG GGCGCACAGG AACCTTGGTG GAAGAGGCCT TCTGGATTGA CAAGATCAAA 
TCTCffTTGCT TTGTAAOGTA CTCAACAGTA GAGGAAGCTG TTGCCACCOG CACAGCTCTG 
CACGGGOTCA AATGGCCCCA GTCCZAATCCC AAATTCCTTT GTGCTGACTA TGCCGAGCAA 
GATGAGCTGG ATTATCACCG AGGCCTCTTG GrTOGACCGTC CCTCTGAAAC TAAGACAGAG 
GAGCAC3QGAA TACCACGGCC CCTGCACCCC CCACCCXXAC CCCCGGTCCA GCCACCACAG 
CACCCCCGGG CAGAGCAGCX5 GGAGCAQGAA OSGGCAGTCC GGGAACAGTG GGCAGAACGG 
GAACGGGAAA TGGAGCGGCX5 GGAGCGGACT CCATCACaGC GTGAATGGGA TCGQGACAAA 
CTTCGAGAAG GGCXXXXTTTC (XGATCAAGG TCCCGrraACC GCOQCCGCAA GGAACGTOCG 
AAGTCTAAAG AAAAGAAGAG TGAGAAGAAA GAGAAAGCCC AGGAGGAACC ACCTQCCAAG 
CTCCTOGATG ACCTTTTCCG AAAGACCAAS OCAGCTOXT GCATCTATTG GCTCCCACTG 
ACTCACAGCC AGATCGTTCA GAAAGAGGCA GAGOGGGCCG AACGGGCCAA GGAGCGGGftG 
AflGCGGOGAA AGGftGCAAGA AGAAGAAGAG CAAAAGGAGC GGGflGAAGGA AGCCGAGCGG 
GAACGGAACC GACAGCTGGA GCGAGftGAAA CGrPCGGGAGC ACAGTCGGGA GAGGGACAGG 
GAGAGAGAGA GAGAAAGGGA GCGGGACAGG GGGGACCX3AG ATCGGGATAG GGAAAGGGAC 
CGAGAACGAG GCAGGGAAAG GGATCGCAGG GACACCAAGC GCCACAGCAG AAGCCGGAGT 
CGGAGCACAC CTGTGCGGGA CXX3GGGTGGG CGCCGCTAGC TQGGAAAACA CTAGAGCTGC 



120 
180 
240 
300 
360
420 
540
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1440
1500 
1560 
AGGTACCAGC CACTCGGCCC CAGGGGGTTA TGGCCACAGA GGGATAGGCA CAGTCTCOjC 1920

CACCCK3GAG CXAAGGGTCT TTCACATCAC CTATCCCTAC ATACATACCA AATGCAAAJ^G 1980

TGGCCATCCT TTTOCCCCCA AACACACCCC CTTAACCTAT CTCTTGGGAjC TTAGCCCGAC 2040

CCTCCCTCTC ATTTCCCATT AAGTCTGAGA QGCAAGAGCT AGGTTAGGCA AGGAGGTGG? 2100 

TCGCCAGAGA TGGGGAACAG CCAGGTGCCC CAGTCCTCTG ATTTTTCCTC CATCCTGCIT 2160 

ACCACCICCX: TGGGTACTTA CAGCCTTCTC TTGGGAACAG OCGGGGCCAG GACTGGGTCA 2220 

CCTATGAGCT GAATCAGCAT CTCCTCCTGA GrTCCCAGGGC CCCTGCAC?rr CCCAGTCTCT 2280 

15 TCTGrrCCTGC AGCCCTTGCC TCTTTCCCAC AGGTTCCACT TTATATCCAC CTTTTCCTTT 2340 

TOTTCAArrT TTAITTTTAT TTTTTTTATT ATTAAATGAT GTGGrCrATG GAAAAAAAAA 2400 

TAAAAATCTG ACTTAGTTTT A 2421 

20 
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 840 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

CTCAAACrcC TGAGCTGAAG CGATCTACCT GCCTCAGCTA GGATTACAGG TGTGAGCCAC 60

CGCACCCAAC CTCAATAAGC KTATTTGATA AAAKATATGC AAGCTCCCIT TATKCACTTT 120 

TCATTCAGAA TCTTTAGTAA TTKyTATTCT TTTTCAGATT TTCAGCCCAA TATATCTCC/ 180

40 TOCCCftCTGT GTCACTGTAT TCTACCTAWA CATCATCACG TGTTTCrGCT ATTGGCTGnA 240 

TGATGGAACA CTCCGGCTCA TTTTCCTCAA AACTGCCGAT AGTOCATAGA RTGCTGOGAT 300 

GGAAACCAGA ARCTTTGAAT TCAAGCCTTG GTTCTGCCTT GTTTTTGCTT GGGTGGCCrr 360

GAGTCAGCCA CATACCTTTT AAAATCTCAA TTTftTTAGAA ATTATTCCAA ATCAAAATCA 420 

AATGAGAAGG TATATACAAA AGTGCTTTAT CCCACAATAA ACTATTCAAG AGAGAGCAAA 480 

50 GGAGAGGACA TTTACTCAAC ACXTTCCTAAA AGGCAGCCAG TGAAATTAC^ CATTTTftTTT 540 

AATCCTCCTG GCAACTCTGA GflCTAAAGCA TTATTAATCC CATTTTGGCT GTTTAAAGAA 600

ATTAITTGCA CTAGATTCCA GCTGTAGTrr AGYTTCAGAA AAAAAAATCC TGAGATGTGA 660

ATTCftCAGCT TTCTGGGITT AAAGCCCAAG CTCTATCACA TCATGCTATT ArrGTTACVT 720 

TACTGCTAGT TCTATGAAAA GAAATACTAA TTTATGAAAT ACATCTTATC CAAAAAAAAA 780 

AAAAAAAAAC TQGGAGGGGG GGCCCGTACC CAAATCGCCG GATAGTCATC GTAAACAATC 840
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2432 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

15 GGCACGAGGC CCGGAACGCT GAGGAAGGGC CCGTCCCGCC TTCCCCGGCG CGCCATGGAG 60 

CCCCGGGCGG TTCCAGAAGC CGTGGAGACG GGTGAGGAGG ATGTGATTAT GGAAGCTCTG 120 

CGOTCATACA ACCAGGAGCA CTCCCAGAGC TTCACGTTTG ATGATGCCCA ACAGGAGGAC 180 

CGGAAGAGAC TGGCGGASTG CTGGTCTCCG TCCTGGAACA GGGCTTGCCA CCCTCCCACC 240 
CTCTTCATCTG GCTQCAGAGT GTCCGAATCC TC3TCCCGGGA CCGCAACTC3C CTGGACCCGT 



20 



30 



40 



50 



300 



25 TCACCAGCCG CCAGAGCCTC CAGC3CAYTAG CCTGYTATGY TGACATCTCT GTCTCTGAGG 360 



GGTCCGTCCC AGftGTCCGCA GACATGGATG TTGTACTGGA GTCCCTCAAG TGCCTCTGCA 420 
ACCTCGTCCT CAGCAGCCCT C3TGGCACAGA TGCTQGCAGC AGAGGCCCGC CTAGTGGTGA 480 
AGCTCACAGA GCGTGTGGGG CTGTACCGTG AGAGGAGCTT CCCCCACGAT GTCCAGTTCT 540 
TTGACrrGCG GCTCCTCTTC CTGCTAACGG CACTCCGCAC CGATGTGCGC CANAGCTGTT 600 
35 TCAGGAGCTC AAAGGAGTOC GCCTGCTAAC TGACACACTG GAGCTGACGC TGGGGGTGAC 660 
TCCTGAAGGG AACCXXXXAC CCACGCTCCT TCCTTCCCAA GAGACTGAGC GGGCCATQGA 720 
GATCCTCAAA GTCCTCTTCA ACATCACCCT GGACTCCATC AAGGGGGAGG TGGACGAGGA 
AGACQCTCCC CTTTACOGAC AOCTGGGGAC CCTTCTOCGG CACTGTGrTGA TGATCGCTAC 
IGCTCGAGAC CGCACAGAGG AGriTCCACGG CCAOGCAG?rA ASCCTCCTGG GGAACTTGCC 
45 CXTTCAflOrGT CTGGATGTTC TCXTCACCCT GGAGCCACAT GGAGACTCCA CGGAGTTCAT 
GGGACTGAAT ATGGATGTGA TTCGTGCCCT CCTCATCTTC CTAGAGAAGC GTTTGCACAA 
GACACACAGG CTGAAGGAGA GTGTAGCTCC CXTTGCTGAGC GTGCTGACTG AATCJrGCCCG 
GATGCACCGC CCAGCCAGGA AGrTTCCTGAA C3GCCCAGGTG CTGCCCCCTC TGCGGGATGT 1140 
GAGGACACGG CCTGAGGTTG GQGAGATGCT GCGGAACAAG CTTGTCCGCC TCATGACACA 1200 
55 CCTGGACACA GATGTGAAGA GQGTGGCTGC CGAGTTCITG TTICTCCTGrr GCTCTGAGAG 1260 
TOTCCCCCGA TTCATCAAGT ACACAGGCTA TC3GGAATGCT GCTGGCCTTC TGGCTGOCAG 1320 
GGGCCrCATC GCAGGAGGCG GCCCGAGGGC AGTACTCAGA GGATGAGGAC ACAGACACAG 1380 

60 
ATGAOTACAA GGAAGCCAAA GCCAGCATAA ACCCTGrTGAC CGGGAGGGTG GAGGAGAAGC
CGCCTAACCC TATCGAGGGC AIGACAGAGG AGCAGAAGGA GCACGAGGCC ATGAAGCTGG 
1GACXATGTT TCACAAGCTC TCCAGGAACA GAGTCATCCA GCCAATQGGG ATGAGTCCCC 
GGGCJTCATCT TACC3TCCCTC CAGGATGCCA TGTGCGAGAC TATGGAGCAG CAGCTCTCCT 
CQGACCCTGA CTCGGACCCT GACTGAGGAT GGCAGCTCTT CTGCTCCCCC ATCAGGACTG 
CyroCTGCTTC CAGAGACTTC CTTGGGGTTG CAACCTGGGG AAGCCACATC CCACTGGATC 
CACACCCGCC CCCACTTCTC CATCTTAGAA ACCCXn-TCTC TTGACTCCCG TTCTGTTCAT 
GATTTCCCTC TGGTCCAGTT TCTCATCTCT GGACTGCAAC GGTCTTCTTG TGCTAGAACT 
CAGGCTCAGC CTCGAATTCC ACAGACGAAG TACTTTCrTT TGTCTGOGCC AAGAGGAATG 
TCTTCAGAAG CTOCTGCCTG AGGGCAGGGC CTACCTGGGC ACACAGAAGA GCATATGGGA 
GGGCAGGGGT TTOGGTGTGG GTGCACACAA AGCAAGCACC ATCTGCGATT GGCACACTGG 
CAGAGCMANT GrrKTTQGGGT ATCTGCTGCA CTTCCCAGGG AGAAAACCTG TCAGAACTTT 
CX3VTACGACT ATATCAGAAC ACACCCTTCC AAGGTATGTA TGCTCTGTTG TTCCTGTCCT 
GTCTTCACTG AGCGCAGQGC TQGAGGCXTC TTAGACATTC TCCTTGGTCC TCGrTTCflGCT 
GCXXZACTCTA GrCATCCACAG TGCCCGAGTT CTCGCTGGTT TTGOCAATTA AACCTCCTTC 
CTACIGCTTT AGACTACACT TACAACAAQG AAAATGCCCC TCGTCTGACC ATAGATTGAG 
ATTTATACCA CATACCACAC ATAGCCACAG AAACATCATC TTGAAATAAA GAAGAGTTTT 
OGACAAAAAA AAAAAAAAAA AAAAAAAAAA AA 



1440

1500 

1560 

1620 

1680

1740 

1800 

1860 

1920 

1980 

2040 
2160
2340
2432



(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1742 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
50 GTCCTGCAGG AGCTGCACGC GGCCGAGGTG CGCANGAACA AGGAGCAGCG 
ICGGGCTAAG GGCCCGGSAC GRGSGGCGCC CATCCTGCGA CGGAACACGT 
GTTTTGrrTTC GTITCACCTCT GrTCTAGATGC AACTTTTGTT CCTCCTCCCC 
CCCAGCTTCA TCCTTCTCIT COGCACTCAG CCGCCCTGCC CTGTCCTCGT 
TCACCAOGGC TTCCCCTGCA GGAGCCGCCG GGCGTGRAGA CGCGGTCCCT 
60 ACCAGGCCGG GOGOSGCTGG GTCCCCCGGG GGCCCTGrTGA GAGAGGTGGY 



55 



AGAAGAGATG 
TCGGGTrrTG 
CACCCCAGCC 
GGTGAGTCGC 
OGGTGCAGAC 
GGTGACCGTG 
GTAAACCXIAG GGCGGTGGCG TGGGATCRCG GGTCXHTACG CTGGGCTGTC TGGTCAGCAC 420

GTCCAGOTCA GGGCAGGTCC TCTGAGCCGG CGCCCCTGQC CAGCACX5CGA GGCTACAGTA 480 

5 

CCTCCTOTCT TTCCAGGGGG AAGGGGCTCC CCATCAGGRA GGGGCGACGG GGGAQGGGGG 540 

TCAIQCHGCC TGGGAAGCCT GCKTGTGCAN CCXSGTGCTTG TTGAACTGGC AGGCGGGTGG 600 

10 GTGGQCSGCTC CAGCTTTCCT TAATGTGGTT GCACAGGGGT CCTCTRAGAC CACCTGGCGT 660 

GAGCT3GACA CCCTGGGCCT TCCTGGAAGC CPGCAGTTQG GGGCCTGCCC TGAGTrCTGCT 720 

GGC3GAGTCGG CATTCTCTCC CAGGGACXXIA TGAGCAGGCT GCATGGTCTA GAGCSTTGTGG 780 

15 

GCAGCATCGA CAGTCCCXXIA CTCAGAAGTC CAAGAGTTCC AAAGACCCTC TGGCCCAGGC 840 

CXXrrCCGTOG GfiCAGCCXXG CCGCCCCTCC CCACCAGGGC TTTGCAGATG TCCTTGAAAG 900 

20 ACCCACCCTA GAGCCCTTTG GAGTGCTGGC CCCTCCTGTG CCCTCTGCCC TGGTGGAAGC 960 

GGCASCACAA GTCCTCCTCA GGGAGCCCCA AQGGGGATTT TKTO3GACCG CTGCCCACAG 1020 

ATCCAGCyrcr TOGAAGGGCA GOGGCTAAGG TTCCCAAGCC AGCCCCAACA CCCTTCCCAC 1080 

25 

TTCGCACCCA GAGGGGGCTG TGGGTGGftGG CCTGACTCCA GGCCTCTCCT GCCCACACCX: 1140

TCTCGGCTGA arXtXTTCTT TCCCTTQGAC GCCCAGTGCT GGCCTTGGAG GACGGTCflGC 1200 

30 IGGAGGAIX^G CGGTOGGGGA GGCTGTCXTT C3TACCACTGC ASCATCCCCC ACTTCTCCAC 1260 

OGAAGCCCCA TCCCAAAGCT GCTGCXrTGGC CCCTTGCTGr AAAGTGrTGAA GGQGGCGGCT 1320 

GAGTrCTCTT AGGACCXAGA GCCAGGGCCC TCAACTTCCA TCCTGCGQGA GGCCTTGGCC 1380 

35 

GGGCACTOCC AGTCTCTTCC AGRGCCACAC CCAGGGACCA CGGGAGGATC CTGACCCCTG 1440 

CAGGGCTCAG GGGTCflGCAG GGACCCACTG CXXXATCTCC CTCTCCCCAC CAAGACAGCC 1500 

40 CCAGAAGGAG CAGCCAGCTG GGATGGGAAC CXAAGGCTGT CCACATCTGG CTTTTGrPGGG 1560 

ACrCAGAAAG GGAAGCfiGAA CTGAGGGCTG GGATATTCCT CATGGTGGCA GCGCTCATAG 1620 

CXSAAflGCCTA CTGTAATATG CACCXMCTC ATCCACGTAG TAAAGIGAAC TTAAAAATTC 1680 

AATCAAAIGA ACAATTAAAT AAACACCTGT GrrGTTTAAGA AAAAAAAAAA AAAAAAACTG 1740 
CG 



45 



50 



(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1487 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

GGCACGAGCC TCCGCGAACT GTOGAGTCGG OGGAGGGCTG GAATCAGCGT GGGCTCCAGG 60 

TCGCTGGCAG CCGGGTQGCA GAACTCTTCC GAGGCTCCTT GGGAAGAAGC TACACCCGAG 120 

GGRGCCGGAT GGGCCTCGAA AACCTGGCCC GCTCTGGrrTC TGTACCATTG CAAGGGGAAC 180 

CGTAAACTGA GCTTTTCTAA CGTGGGTTTC TGCCAAGTAC TTTTCCAGCT GCCCCCTTCC 240

CCCCAGCACA CAGGAGAGCC TCrGTCTAGC CAGCGCTTGA CAGTCGTTAG GTAGGTTGTA 300 

CTGTCTAGGG AGGAGCTCAA GATCATGAAT GGTTGTCACA GGAGAAAGCG GTTGCATCTT 360 

TGCAAAACTA TATACCTGCT GrGGrTTGTG I ' lTl ' ClTrr C TGCTGAGTAA TGAAGTTGTA 420 

AGTTCACACT GGCACATTCT CAGGGCTGTG CAGATTATTT GCACTTTATT TCATAGGHPGR 480 

ATAAGTCCTT TTTAGCTrTC TTTGTATATT GAGTTGCTTT TCAATTQCTT CCCATATTTT 540 

TATTTCATAC AAACTGAACA ATTCrGGCCC CTCTATTTTA TTTATAAAGG TTCAGTGTAT 600 

CTTTGCCTGC CTACATCAAT CTGCAAGGGA GTTGCAGAAA GCCTCATGTT CATCGAQCCG 660 

TCAGTCACAA CCAATTTCTA AGCTGTTATA ACAAAAAAGT GTTTGCTrTT TTTCACAAGT 720 

AACTTTAAAA GTGTAGTTTA GAAAGAAAAC ATTTTCAATA AAAAGACACT ACATTAATCC 780 

TGGATGCrTG CAAATCCTAA AASTfTATrCC TCCTCTAGCG TTGCACAGCT CTGTGrnCTA 840 

TACACflGACT AGCTTTAAAA TTTGrTCACAT ACCACTTTAC CTTTACTTTT ATGrTATCATT 900 

CCCCCGACTT CCrTACTGCA GGTOTGGGCA AGAAAACTTT TCCTTTAACA CTTTTCAACA 960 

GCGGGCATAA AATTCTQCAG CTGAGGTCTT GAAGAATOCA GATGGGTACA GrTATGTGTTG 1020 

GAGCTCACAG TGTGTATTGA CTAACCTAGT TCCTTTTTTG CTTTTTTTGG TATTGTCTTG 1080 

TTAAAAGTGA CTCCCAGGTA GCAACTCTCT TTTETAAGGG TGQGAACGAA AQGGACGTAG 1140 

GAAGAATAGA TCTAGATTAT TTAACAGTCT TCGATAGAGT TTGAAAGCTT TCTTCTTCAT 1200 

TCAATTTTGG GCAAAATACT GCCTCTGCAT TPGTTCATAA CAAAAAGATT AGATTAATAA 1260 

GTAGCmTG TTGGTGGAAA TTACCAGCTC TATAAGTCAC CCTTGGrrGGT TCATOGACCT 1320 

CTGATTAGCT TQGGTrTTTGC AGrrCTCATTG CCACATGTAT ATGTGGAGCC AATOGCCTTT 1380 

TGGrOCTCAG CTGTTTACGT CTGACTCCTT GACTTCTTTG GTACAGTGAT GGAGTCAGAT 1440 

CTCATTAAGT GTGATTCTCC ATGGATATAA CCAQCCCCAA AAAAANG 1487 



55 



60 



(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1328 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

5 

GGCACGflGCT CCrTGCCGAAT TCGGCACGAG AGAAGATTTG AAGAAGCCAG ATCCAGCTTC 60 

CCTGCGGGCT GCTTCTTGTG GGGA^GGGAA AAAGAGGAAG GCCTCTAAGA ACTGCACCTG 120 

10 TGGCXrrTGCX: GAAGAACTGG AAAAAGAGAA GTCAAGGGAA CAGATGAGCT CrCAACCCAA 180 

GTCAGCTTGT GGAAACTGCT ACCTGCX3CGA TCCCTTCCGC TGTGCCAGCT GCCCCTACCT 240 

TOGGATGCCA GCCTTCAAAC CTGGGGAAAA GGTGCITCTG AGTGATAGCA ATCTTCATGA 300 

15 

TOCCTAGGAG GTTCCTGACA TQGGACCCAT CTGCTCCTCC AGCCAACTCC TGTCCCTCAC 360 

ATCCCACCAT GC3TOGCTCCT CCCACCTCCT CTGGATTTGT TCACTCTGAG ATCTGTTTGC 420 

20 AGAGTGGGTG CTTAGCAGAC AGAGTGAAGC TGGCTGGGGG GCACAGTGCT arGTAGTGCT 480 

GCTGTCTATC AAAAGACCAA GGTATTATGG GACCTGGrrTT CAGAATGGGA TGGGTITCTT 540 

CACCTCATCT TAAGAGAAGG GAGTGTGTCC TGAAGAAGCC CTTCTTCTGA TC5TTAAAATG 600 

25 

CTCACCAGAA CGCTCTTGAG CCCAGGCATC GTTGAGCATT AACACTCTGT GACAGAGCTG 660 

CAGACCCCTG CCTTGAGTCT CATCTCAGCA ATGCTGCCAC CCTCTTGrTCT TTCAGAGTTG 720 

30 TTAGTTTACT CXATTCTTCG TGACACGAGT CAAGTGGCTC ACAACCTCCT CAGGGCACCA 780 

GAGGACTCAC TCACTGGTTG CTGTGATGAT ATCCAGTGTC CCTCTGCXrCC CTTCCATCCC 840 

CAACCACATT TGACTGTAGC ATTGCATCTG TGTCCTGTTG TCATTTATGT TAACCTTCAG 900 

35 

CTATTAAACT TCCTGCATAT CTTGACATAT CTTGAGAITC TQCATGrTCTT GTAAAGAGAG 960 

GGGATGTGCA TTTGrGTGTG ATGTTGGATA GTCATCCACG CTCAGTTTGG ACCATTGGAG 1020 

40 GAACTTAGrrG TCACGCACAA ATGGOGCTAT TCCTACGCTT AGAATAGGGC TTGTCTGCCC 1080 

ACTTTAGAAG AGTCCCAGGT TGGTGAGCAT TTAGAGGGAA GCAGGGCAGA ACTCTGAACG 1140 

ACAATACGTC TCTCTGAGCA GAGACCCCTT aXJiWlTCrTT ATCCACCCAT AOXSGACTTGG 1200 

45 

AATCAATCTT GCCAAATATT TGGAGAGATT GrrGrrOGATTr AAGAGACCTC GATTTTTATA 1260 

TTTTACCAGT AAATAAAAGT TTTCATTGAT ATCTGTCCTT G3\AAAAAAAA AAAAAAAAAA 1320 

AAACTCGA 1328



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1856 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
GAATTCGGCA CX^AGCTCTGC AAG^TTSC-A i^ZZ^-JkCTTC-I A 

ccTAGATTAA ATTCCCCGGG CTGAA.-.rra-. Grrrc-C^:^^.r: T 

TGCTGTCTTC AATTAAACCA TTTATa^-^CA T^.-^.-J^^TTT T 
TTTCCftGGCC TTCCTTCTXT GTAC^AA.-^ A^-.-.TC^rXLv: i 
TTCAAACATG ATGCTAATTT AAATT.-.-ITT.:^ CTTCCC^VTG.-. I 
TTTGCCACTG TTATTAGTTC TCTCAAA.-_-.r AC-.TCT^CG^ ^ 
TTTGATTATC TTTCTATCTC TTTTATTTAr TTCrTCAZTT?-. C 
GGTTGGCATT GATACAGTAA AlVrOTAAAT GAGGAGACA-. " 
TGTCCTTAAT 6ACtCTAGCA GAATSCrTTT TCrcrAA;^^^ A 
TAGTTTGATA GATTTGCAAG CTATGCTGCT TCa=C?GAA27 7 
CAGGCTTCTT TGTCTCIGGT TGCAGCTTOr ATSATCjCCC C 
CX3GAGATCAC AAA7CAGGCC CT?GGT:?rA3 TIGCIAST^T G 
GCAGAAACTG ACCICACTGG GCAAGGCnG-S CXATGC^CT 3 
GTGTTCAGGA AGCCACAGGC CACAITrXAr TCra--^AA3 J 
ACAAAGTATA ACAACCCCTT AAGATA--.rC TA-.-'-AA--^ ' 
ATACCATTGG CCAATTACAA GATAAAA-.TG TTCAArrTCr : 

TGrrcnrrcA tctcttccta rrrAXAmG TCAcrGrrA^ " 

GAGGAAGGAC TTIGCTGCAC TTACTGCACC ACATCAAACA ; 
CrmTAAAA AATGTTATTC TGATTAIAAT AATAAZATI^ I 
GCCACCTTGC AAGtHTTAGT GASATTT^CG GA^unrTX-AT . 
TAGCTCCAAA AATTTGCGAA QCAAAAGCrA GCCCCAATTG : 



^rCGP-jOGG? TCCX3CTGCCC 
ATA-.TATCA TATTTTA.-AT 
2A33ATGrC GATGCATGCT 
AASCSmC ACTTA-TATTC 
ATCrr^ATTA TTCCTATGAT 
A^P^GGATTA TTTTAAGT?A 
?^AGAAAJr TCGTTCCATT 

a:za?aaaat ctaaattact 

^ZTGTCTT TCTTGCAGTT 
lASrTGCGCT GGTAGGAACG 
S;C^J3GCA3 ACAACGTAGC 
r?sa-.GGTGC AGAGAGCTTTG 
^-TTCTTTAA TGCftCTCTAT 
-AAA2.AAGAG GAAAAACCCC 
:G.-AA'?rAAr TTTTCAarTr 
rZ.-AG.iATCC TTTGTTGACT 
?CA-^^^AACT CITATTTGCT 
rr3GGGftGG3 TGGTGTTTAA 



60 



TTAACAGATT TGCATTTGAA GTGAC?3CAS ACATTAGGTr 
AAAGfiGGAAT AAASACATC? YTTCTdCrA GAAAASATAA 
CCXACTTTCA TOl^ySATCAG CTTGrrcrGAT AACCTSaTAT 
TGATAATACT GGTACrTTTG TAA1'1V_GCT GCTTGC^OTTrA 
TTCAYCTTTT CTTCGAACAT YCCTAriCC? AGAIG^yGrTT 
AACTGTCCTA A lT l TiXJlTC TGTTACCC^A TCCCCCTT77 
ACAATTAAAT ATCACACTAT GACATATSAT TTAAGIAGa^v 



I^CTTTTTTCA TGAAAAGAGC 
iCriAAGCAfi GAATTGCTSC 

^ ■ .vvj^ gaagt ttgaaactga 
cas^-cattajg ttaaaaat.ag 
c:acc?jcaa:it aataatcctt 

3A3rrGTCAI3^ ATGATAAACA 
AX^^AGATAGT AAAKGATGAG 
CACCTCAAAT TGGGAATTAT 
GCTT7AATAC CCACAGTGTA 
CAZ^TAAAG ATAAATTITA 



60 
600
660
840
GGGCTAAATG TTTACTTCAA AATGACTCCA TATTTCAAAT ATCTGTTTAG ACTGTGAAGG 1740
CCAAATAATT TTTAAGAAAA CATTTGAAGA GTAGTGTGrTT TGCATTTGTC AATAATCTTA 1800 
5 CTCACAGCAA GTAAACGTAA TAAAAGCCAA CATTTAAGCX: AAAAAAAAAA AAAAAA 1856 



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1558 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

20 TGGGTATCCA TTCCTGNAAT TACTTTACTT AGGATAATGG CCTCCAGCTC CGTCCAAGTT 60 

GCTGCAAAAG GTATTATTTC GTTCCTTTTT GTGGCTGAGT AGTATTCCAT GGTGTATATA 120 

TACCACATTT TCTITATCCA CTCATTGCIT GATGQGCAGT TAGGTTGGTT CCACATCTTT 180

25 

GCAATTGTGA GTTGTCCTGC TCCAGATATC ATCTTTAACT CCTTTGCCTT CTCCACATAC 240 

ATTTCCAAGT CCTGrTTCATT CTACCTCCAA AATGTATCTT GTATCCATTC ATCTCTCTCC 300 

30 ATCTTCAATC TATTTCAATG CCCCATCATC TCTTC3CATGG AGGAGTGTAA TAATTGGCTA 360 

ACTCGCCTCT TCTTACATTT TAAAATCAAA AGATGTGACA QGTGAAATGC CTATTTCAGT 420 

GTCCATTGAT GGrTTCTGCTT ACACACCACC TGGCTGCCTG GrGTCGCAGT GGCAGAGTTG 480 

35 

AGCAGTOIGA AAAAGACTGC TTGGCCCTTT ACAGGGAAAG CAGGTCCACT CTQGCCTGTG 540 

AGGACGAGAG CTCTGQGCAG GCTOGGACAC TQGCAGACCC TQGTCCTGGC TGGCCAAGGC 600 

40 AGCAGGGTAT GTGTTTCGGG TCACTCACAG GGCTCAGCAC CACPCCTCAT GGCTTCCTTA 660 

CTCTITOGGC AGAGGCTGAC CCGCGGCTGA TPGAGTCCCT CTCOCAGATG CTGTCCATGG 720 

GCTTCrCTGA TGAAGGCGGC TGGCTCAOCA GGCTCCTGCA GACCAAGAAC TATGACATCG 780 

45 

GAGOQGCTCT GGACACCATC CAGTATTCAA AGCATCOCCC QCCGTTGrTGA OCACTTTTGC 840 

CCACCTCTTC TGCGTGCCCC TCTTCTGTCT CATAGrTTGTG TTAAGCTTGC GTAGAATTGC 900 

50 AGGTCrCTGT AOGGGCCAGT TTCTCTGCCT TCTTCCAGGA TCAGGGGTTA GGGTGCAAGA 960 

AGCCATTTAG GGCAGCAAAA CAAGTGACAT GAAGGGAGQG TCCCTGTGTG TGTGTGTC3CT 1020 

GATGmrrccT GGGTGCCCTG GCTOCTTGCA GCAGGGCTGG GCCTGCGAGA CCCAAGGCTC 1080 

55 

ACTGCAGCGC GCTOCTGACC CCTCCCTGCA GGGGCTACGT TAGCAGCCCA GCACATAOCT 1140 

TGCCTAATGG CTTTCACTTT Ci ll CTi TCTT TTAAATGACT CATAGGTCCC TGACATTTAG 1200 

60 TIGATTATTT TCTGCTACAG ACCTGGTACA CTCTGATTrT AGATAAAGTA AGCCTAGGTG 1260 
TTGTCAGCAG GCAGGCTQGG GAGGCCAGTG TTGTGGGCTT CXrTGCTGGGA CTGAGAAGGC
TCACGAAGGG CATCCGCAAT GTTQGTTTCA CTGAGAGCTG CCTCCTGGTC TCTTCACCAC 
TGTAGTTCTC TCATTTCCAA AOCATCAGCT GCTTTTAAAA TAAGATCTCT TTGTAGCCAT 
CCTGTTAAAT TTGTAAACAA TCTAATTAAA TQGCATCAGC ACTTTAACCA AAAAAAAAAA 
AAAAAAAAAA AAANAAAAAA AAAAGGGGGC CGCTCTAGAG GTCCAAC?rTA NSACGNGG 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 948 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
TAAAAATCAT GCTCTGTACC ATCCTCACCG TAGTCATCAT CATCGCCGCG CAGACCACGA 
GAACTACTGG GATCCCTAAA AACGCCCCTG GTCCGGCCCC ACTCTGOQCC CCTCGATCTC 
CCAGGCTCTT TCTGCAGWCA TACCGCGGAC CCAATGGGCG CCCTGCACAC CCGTTTCTQG 
GGCCGTCAGA CTTGGATACA TCGTAAACTC OGCCTCCACG GAACGTCTCG CCTKGCGAGC 
AACSMTCGGAA TCCAGTTCCT CAGGAACCCC TCCAAAACCC ACACCCCCAG GGACGCOGCT 
TTCCGQQATC COGGSCAAAC GCCGGACCCT CAGTCGCTCC AGGCCCCCTC ACCCTCAAAG 
TGTAGCGCCC CCAACCGAGC AACCTCGGTT TGGTCCCTAA AACCCCGCCT CCTCTATAAG 
CACCGCCCCA GCTCIGACAA AACCCCGCCT CCAGGTCGGC AGGCTCCGCT TCTTTTCTTC 
TCCGOGQGGT GATTCAGTCC AGTGATPQGG TTTGTGGCTC CAGGCCTCGC CCACAGACGG 
ACAGACCCCT COCTTTCTTC CGGCAAAAGG ACCGAGCCCT GGGGTAGTAA GGSCCCCACA 
CTCCTGrTTT TTGCAAGTAC AXTTTTGTCC YTCCTCCACC CftGGTATCTG CCTATTTPCT 
TGCTAATCCC AGAACCTTTC CTTTTGCTTT TTTTAAGGAC ATTTGGGAAG TTCCTGGTOT 
AQGACCCTTC TCCCTGGGAT AAGAAACCTG CCTGTAAACG CTCTGTAAAT ACTCCCTTCC 
ACCCATCCCA GCCCCTGGGC AGOCGGGCAG AAGGGAATCC AGGCTATGGA CCTCCCAAGT 
COCCGCTCCC CGCTCCCCTC GGCGGCCCOG CCTTGTTCTG ATCTGTGTGT GAGTGTOTGT 
GAACTTCTGA AAGACAATAT TAAAGAGACT TAGTTGAAAA AAAAAAAA 

(2) INFORMATION FOR SEQ ID NO: 55: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

10 GQGGAACK3C AGTGACAGCA GGAGTAAGAG TGGGAGGCAG GACAGAGCTG GGACACAGGT 60 

AaXXSAGAGGG GGTTCAGCGA GCCTAGAGAG GGCAGACTAT CAGGGTGCCG GCQGTGAGAA 120 

TCXAGGGAGA GGAGCGGAAA CAGAAGAGGG GCAGAAGACC QGGGCACTTG TGGGTTGCAG 180 

15 

AGCCCCTCAG CCATCSTTQGG AGCCAAGCCA CACTGGCTAC CAGGTCCCCT ACACAGTCCC 240 

QGGCTGCCCT TGGTTCTGGT GCTTCTGGCC CPGGGGQCCG GGTGGGCCCA GGAGGGGTCA 300 

20 GAGCCCGTCC TGCTGGAGGG GGAGrTGCCTG GTGGTCTGTG AGCCTQGCCG AGCTGCTGCA 360 

GGGGGGCCCG GGQGAGCAGC CCTGGGAGAG GCACCCCCTG GGCGAGTGGC ATTTGYTGCG 420 

OTCCGAAGCC ACCACCATGA GCCAGCAQGG GAAACCGGCA ATGGCACCAG TGGGGCCATC 480 

25 

TACTTCGACC AGGTCCTGGT GAACGAGGGC GGTGGCTTTG ACOGQGCCTC TGGCTCCTTC 540 

GEAGCCCCTC TCCGGGGTGT CTACAGCTTC CGGTrCCATG TGGTGAAGC3T GTACAACCGC 600 

30 CAAACTGTCC AGGTGAGCCT GATQCTGAAC ACGTGGCCTG TCATCTCAQC CTTTGCCAAT 660 

GATCCTCAGG TGACCCGGGA GGCAGCCACC AGCTCTGrTGC TACTGCCCTT GGACCCTQGG 720 

GACCGACyrcr CTCTCCGCCT GCGTCGGGGG NAATCTACTG GGTGGrTTGGA AATACTCAAG 780 

35 

TTTCTCTGQC TTCCTCATCT TCCCTCTCTG AAGC5ACCCAA GTCTTTCAAG CACAAGAATC 840 

CAGCCCCTGA CAACTTTCTT CTGCCCTCTC TTGCCCCANA AACAGCANAA GCAGGANANA 900 

40 NACTOCCTCT GGCTCCTATC CCACCTCTTT GCATQQGAAC CTGTCCCAAA CACCCAAGTT 960 

TAAGAAAAAA ATAAAACTGT QGCATCTCCA 990 



45 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1603 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

GGTCGACCCA CGCGTCCGGC CCGCCGGCTC CGGAGCGGCT CTGCCTTCCC GAGCGCGGGA 60 
CCGCGCCCTC GGGGAGGAGG GCGAACGACG OGGCGATGGC TCCGCGGQCA CTCCCGGQGT 120 

CCGCCGICCT AGCCGCTCCT GrTCTTCGTOG GAGGCGCTGT GAGTTCGCCG CTGGrTGGCTC 180

CGGACAATGG GAGCAQCCGC ACATTGCACT CCAGAACAGA GACGACCCCX; TCGCCCAGCA 240 

5 ACGATACTCG GAATGGACAC CXZAGAATATA TTGCATACGC GCTTGTCCCT GrrGTTCTTTA 300 

TCATGGGTCT CTTTGGCGTC CICATTTNGC CAMCTNGCTT NAAGAAGAAA GGCrATOGTT 360 

CTACAACAGA AGCAGACXIAA GATATCGAAG AAGAAAAAGG TTGAAAAGWT AGRATTGAAT 420

10 

GACAGTCTGA ATGAAAACAG TGACACTGTT GGGCAAATCG TCCACTACAT CATGAAAAAT 480 

GAAGCX3AATG CTGATGTYTT AAAGGCGATG CTAGCAGATA ACAQCCTGTA TCATCCTGAA 540 

15 AGCCCCXTTGA CCCCCAGCAC ACCAGGGAGC CCGCCACTTGA GTCCTGGGCT TTGTCACCAG 600 

GGGGGAOGCC AQGGAAGCAC GTCTGrPGGCC ATCATCTGCA TACGC?rGGGC GGTGTWGTCG 660 

AGAGGGATGT GTGTCATCGG TGTAGGCACA AGCGGTGGCA CTTTATAAAG CCCACTAACA 720 

20 

ACTTCCAiGAGA GAGCAGACCA CGGCXXXZAAG GCCSAGGTCAC GGTCCTTTCT GTrGGCAGAl* 780 

TTAGAGTNAC AAAAGTGGAG CACAAGTCAA ACCAGAAGGA ACGGAGAAGC CTGATGTCTG 840 

25 TTAGrTGQGGC TGAAACCGTrC AATGGGGAGG TGCCGGCAAC ACCTGTGAAG AGAGAACGCA 900 

GTGGCACAGA GTAGCAGGTG AGCCGTOGTT TTGGTCACAT TGQGGGCAGA GrTGGTGCAGG 960 

GTCAGGAGAA GGTACTTGGA GCCTCCCflGG TGCTGTQGCA GCATAGGAAT GGTATTTGAC 1020 

30 

AGGGAAGTOG GAGAGCTTrC CTTGACCCAG GAAGACTGAG GGGGftCTGAA CATGATTACT 1080 

TGrCTGCCTA GAGCTTCTTG TAAAGAAGTC ACAAACTTAG TGCCTCCAGG GGCTTGGCTG 1140 

35 TGTGATAATC AGGATAGAGG ATTACTTGrG AGGCAATGTG GCATGGTGGG GATTC3TGGCA 1200 

AACTAGAATT CACATCACCC ACCATATAGG GCTTGCATTA CCACGAGGCA GAAAGCACCT 1260 

AGTCTTQCTG CATCTTCTTA CGCAAAAAAG ACAAAATCCA GACTTCTAAA ATGTAAAATC 1320 

40 

ACTGATTTTC GATATTGGCA GCrTACmT TTTITTTAAA CAACCATGCA GGCCAAATGA 1380 

CTTOTAATCT TGrrCACCATT TTTAGGTAAA CTGTGACTTG AAAAAGTCTG GAGCAAACAA 1440 

45 ACCAATCCTT TTTCCTTTTA TTCTGTTGGR AACCAGnTT GTTTGTGTCA CAGrmTGAA 1500 

ACCTCAATAC GAATATTTCT CTTCCCACCA AATATTTTGA GGCAATTGAA AAGCCACAGT 1560 

GATTTATTTC TTGATTTGGC AATTTTAATT TTGCAAGACA ATT 1603 

50 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1052 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

TACAGCTCAG GATGCCTGTA ACATTGTCAT CTCTC5GGCTT CTGGGTCCTG CTTAGCCTGC 60 

TTTTTCCCTG GAGGACTGAC CAGGGATGCG GCCCAGCAAC ATGTTACTAA ATCATACTCT 120

CCrCCCTACC TTTCCCAGAC CTCTCACTCC TGCCTGGTGT TCCAACCCC3T TCTGTCGCCA 180 

GAGTATACAT TTTGGAACCT CTTCGAGGCC ATCCTGCAGT TCCAGATGAA CCATAGCGTG 240 

CTTCAGCAGN AAGGCCCGAG ACATGrrA3X5C AGAGGAGCGG AAGAGGCAGC AGCTGGAGAG 300 

GGACCAGGCT ACAGTGACAG AGCAGCTGCT GCGAGAGGQG CTCCAAGCCA GTGGGGACGC 360 

CCAGCTCCGA AGGACAOGCT TGCACAAACT CTCGGCCAGA CGGGAAGAGC GAGTCCAAGG 420 

CTTCCTGCAG GCCTTGGAAC TCAAGCGAGC TGACTGGCTG GCCCGTCTGG GCACTGCATC 480 

ACCCTGAATG AGGCTGGCCA CCTGCCACTT TGCCCTGCCC TCTGCCTCCA GGGCTCCMCT 540 

MyCCTTCCTT TTCTTGCrTGA AAGGCACCTC CTTTCCTGAT AATGAATGGT GriTCCCTTTG 600 

CTTGGCTOGG GAGCCCCCCA GGCCAGGTTT GCTQGCCATA GATACCTTTG GGCTGCCTGR 660 

GACAGGCTCC TGAGGAGGAT TGAGGGTGAA AGTCTCCCAC GAGTACACTA AACCTAGGTC 720 

TGGTCACCAA TAGQGTTPGG AGAGCAAAGG GCCACAACTC ATCAGCTGCC TGTCTCTTAG 780 

ATGCACTTTC TTTTTCCACC AGCACATCCT TCAACACACA GAATTTCAGG GAAGAGTTCT 840 

CCCCAAAACC CTAGCTCTTT ACCCTTCCAT TTTAGCCTTC CACCCAGCTT CCACAAAAGA 900 

TTTGGCTCTA CCTTGGATCT GCTAGTAAAT AACTAATAGG CAGGCAGTTA TTTGGGTAAG 960 

GAAAAAAGGG GTGQGAGAGA CAGAAAATTT GCCCACTGCT GCTCCTCCCC TTGGSTYTCC 1020 

ACCTGGGATT TGCTATTGAA TCTCTACCCT NN 1052 



(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 814 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
ACNCGNTGGC GGCCGCTCTA GAACTAGGGG ANCCCCCGGG CTGCAGGAAT TCGGCAOGAG 60
CATAGACTTT TAAACTGGTA CGGTTCTTAG AGATGGTCCT TGGCCTTCTG TTGTPGrTGT 120
KGTTITTTTC TTTrTCTTCT TCTCCTTCTC CTTCTTCTTC TCTTCTCCTT CTTTCTTCTT 180
TTnTTTTCA GAGTCTTGCT CTGTCACCAA GACTGGAGTO AAGTGATGTG ATCTCGGCTT 240
ACTGCAACCT OQGAGGCAGA GGTTGCAGTG AGTCGAGATG GTGCCATTGC TCTCGTTTGG
GCAACAAGAG TGAAACTCTT GTCTCAAAAA AAAAAAAAAA ATGAGGTTTA AGACAGTrTT
GTCATTACTG GTGGGATCTG GTCACACAAG ATAGCATTAA ACGTGACATG GCACATAAAA
TTGGTTAAAA AATTTTOTTT TTTAATTACG TAATGTAAAA GCCCAACAAA CACTTTATGC
AAGATTGGftA TGTATCTTCA AATTCAGATT TAATAAACAT GTAAAGATCC TCTGTATATA
AAAGTTGTAT TTAATCCXOT GTGCCCCAAG AATGCTATAA AAGATCCCAA GAATGTTATC
TATGAAAAGA TAGCAATAGG GAATQGTGAA CAAATAATTT AATTTGCCAA TTCTAAAAAA
CATGG3VCTTA AACCCCATGA AAACTTQGTT CCATAGTTTT AflCTGTTTTA TGGTTCCAAT
ACAAAACCAG AGTGGTTrAC ATTCCACAAT NACCAAATTT GCATCCAATN TTGGGGTAAT
TTTNGGTATT TGCCATGGGA TACTATTCAT TTTT



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1215 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
AGAGGAAGTC TTTTGCCAAG CCTGTTCTCT GGACTAACGC CATCCAGGCT GGGAGGGGAA 
GAGTQCTCTG CTACACTCGT CCCCCTCCTG CCTCATCTTC CTTCTCAGCC TTGCri'VCU^lX; 
ATGGGAACAG AATQGAGGGC CTGAGAACAT ACTTTCTAAA TGCCTTTGAC CCAGGAACOG 
ATTATCTATA TTTGTTCCCA TrTTCCTTCA CCGTGACATT CCAGCATTGT CTGACTGrrGA 
GGTGGGCCTT TGAGAGCCTC CAGGTTCCTC AAAACAGGCC TGAGCGATGG GCATCACACC 
CTCTQCCTAC CCACRTGCCT GCTTACCPGC CAGATAACCA AGTOJAGATG TCTQCGAGTG 
GCTAGTTTTC ACATTCTTAC TAGTCTTTGG YTCACCTTTG GGCAAAGGOC CCCTCTAGGC 
CTTGCCCCAC CTCCATCAAA CGCAGACACT GTAGTCAGAC CTCAGYAATA TAGGAGGCAA 
TAATCTTTTA ACAGTGTTTT GCAAACAAAC AAAAAGAGAA AAATCCCAGC CAGGGGAACT 
CGCCACCTGC CCACGCTAGT TCCATCCACG CTCAAGACCC GCCCTTAGAC CAGGCAGGCA 
AAGGCCCCCA TCACACTCGG CCACTAGTCG GGTCCTGAGG CCAAGAAAGA AACCAGACOC 
TGTATGACAA GITGGGKTCT TTCCAGAACA CGACAGAAAC AGGGGGGGCC CCTTGTTAAT 
GCCACTCCAT ACTCCAGAAG CATTATTCCT TATTTGGGAC AGCCAAGGGC AGATTCACAG 
GTTATTGTAG GAATAAAGAC TAGTTTACAA AGGARAAAGA GSCCCTGGAC TTCCCMAGGA 
AAGGTCAGGT TAGGGCICCT GTACCCATTC TGTTCCACCA CTGTTTGATC TCTCTGGCCT 900
CCCACCAGGA ATGCCGTTTC CTTTTTATGG ATCTGTTGGG AACCAGAGAG AATCAACAGA 960
TCAAIGACAT AGGATCCGAA GTCSCAATGAT AGTCACITCT AGTTTGGCAT TTCACAAACT 1020
CTGNACAGCA AGGTATTGGT AGGTTACTCA ATTTCAAAAG GGCCCCATGG CCAAATATGT 1080
TTAGGAACCG CTGTTTCNAT TTCTTTTTTT GGAGACGCAT TGTATATAAT ATATGTCAAA 1140
GGCTTTCGGA ATTCCTGCAG GAAAGAAATC AGCTTTGTTA AATCCmAAA AAAAAAAAAA 1200
AAAAAAATAG ACTOG 1215
(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 478 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

ATTTCTTATG ACATGGQGGT TTGAATTGGT TGGCAAATGT TTAATTTTAA TATCCATAAT 60
CAGTGAGGTC CTGCTGGCTG TAATCATTAA TTGTGAAATC TAAGGAGCTT AGTTCATGGC 120
TCTAGAATTT CACAGAAAAR TGYGOTATGA TACGAGCATT AAGTTTATTT CTTCTGATCT 180
TTGATGCAGC TTTGTTCAGT TTATCTGTTT TTGTATTTAT TGGTCATCTA CTTCCCATGC 240
CAAAAGGGAC TGGTCTACAT AGCCGCGCTA AACACCTGAT CAAATCACTA AAAGAAAATG 300
TGTTACCTCT AATGAATTAT CCTGATTGTA AGTTAAAAAT CAATATTTCC CCGTAGTCAG 360
GTTTGCTTTT TAAAAAGAAK KdTAAAAAA AAAAAAAAAA AAACGAGTTN AAGAAAAGGA 420
AGCAAGCTCA GGTAAGGTGC ACACATTQGG CTAAGGAAGC TAGAGCCTGT QGAGANGC 478
(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 618 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
TATGACCTTG ATAACCCCAA GTTNGAAATT AACCTTCANT AAAGGGAACA AAAGCTGGAG 
TTCGOQCGCT TGCAGTTCGA CACTAGTGGA TCCCAAAGAA TTOGGCACGA GTCATAATGA 



60 
GCTACTAGGT AAGCCTTCTG GGACTTTCAG ATATTTTGGG GAAGA.TTGAT T i TlXJlU ^ -'iT
ACA1X3CTGTG GACCCTTGGC CATCAAATGG TATGGGGAAG CTCATCCGTC TCTCTOTGAT 
GGTCATGTCA GTCAGGCGTC TTTTTAGTAT TTACTGGGTG CTCAGTACTG TGCCAGATGC 
TGTCGGGAGC CGTGGTQGTA TGGAGGAGGA GTGCTCCAGA GGACTCTGCT GTGTGGCAGG 
CCAGCATAAA CAAGCCAAGG GGAAAAGGCA GGCATGGAAT AAAGGGGGAG AATACCAGTG 
TGrTGACTTAC TGCTGACTGT GTGGATTAGC CTATCAGCAG TAATCAAGCA GGGCGGAGGG 
CATTATCTTT GAGCCAGAAG AGTGAGCACT GGSCCGAGGG TQGAGCATCA AGAGGGGGTG 
TAGGACCNCA AQGCTTCrm CNGGGGAGAC AACX3TCAATA AGCNGTCAGT AGTCACOGAC 
AGTTTTC3GGA AGCAAC3QG 

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 751 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:



TCGACCCACG CCyrCCGAGGA GCTGGACTTC TGAGACAGCC ATTCTCCTTG CATAGCACTG
TCPGCTGCTA CAGCTCATAG AAGTCAACAA TTTTCTTCAA CACTGGTAGG CAGCCTCTAA
ATGGCCCTGA TCACCCTCAC [missing] TCACACCNm' GTAAAATTCC ACCCCTGGAC
CTAGTCACTC ACTTCTAACA ANGAGAATAC AGCAAAAGTA ACATCGCTTC TGAOGTGAGG
CTACAAGGAG ACTACGATGC CTQCCTTGGT CACOCTTCTC CTQCTCTTTC CATTGCTCCC
TCTGATGGAA GCCAGTTGCC ATGTGATGAG GTGCCCTATG GAGAOGCCCA OGTGACAAGG
TATTGTAAAA AGCCTCTGAC CAATAGCCAT CTAGAAACQG AGGCCCAGTC CAGCAGCCTC
TGAGATGAAT CCTGCCAACC TGAGCTPGGA GACAGATTCT CTCCCTATCC TGCCTTGGGA
TGATCACAGC CACCACCAAC ACCTTCACTG CCTGGTGAGA GGCCAAGCCA GTGAACCCAA
GGrrAAACTGG ACAGAATCCT GACCCACAGA AACT3AGATA ATGTTTGTTA TTTTAAGCTG
CTCAGmTGT TACAGAGCAA TAGATAACTA ACTCAAACAC CATAAAATTC TAATATTTTA
TTCTATCACA CAAACCAGGT AATACCAAGT AAATGCCATT ACTATACACA TATTTTTGTA
ACACAATTAC ATGTGATTTT TTAAGAAGGC T
(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 780 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

CNCSXS-iJrCA CX3TCCCCGA TtCCCGGGTC GACCCAOXX; TCCGGC7ITGG CAACTCCTGA 60 
GGCCIGCA-TG GGTGACrrCA CATTTTCCTA CCTCTCCTTC TAATCTCTTC' 'fAGAGCACCT 120 

15 

GCTATCCCC^ ACTTCTAGAC CDGCTCCAAA CTAGTGACTA GGATAGAATT TGATCCCCTA 180 
AcrcAcrK?rc TGCGCTTGCTC AITQCTGCTA ACAGCATTGC CTGTGCTCTC CTCTCAGGGG 240 
20 CAGCATGCTA ACGGGGCGAC GTCCTAATCC AACTGGGAGA AGCCTCAGTG GTGGAATTCC 300 
AGGCACTSTG ACTGrrCAAGC T3GCAAGGGC CAGGATTGGG GGAATGGAGC TGGQGCTTAG 360 
CTCGGAGCJZG GTCTCAAGCA G?£AGGG?AT GGGAGAGGAG GATGGGAAGT AGACAGTGGC 420 
TCGTA=*3GCT CTGAGGCTCC CIGGGGCCTG CTCAAGCTCC TCCTGCTCCT TGCTGriTTC 480
TGAI^=A!mG GC3GGCITCGG A3TCCCmG TCCTCATCTG AGACTGAAAT GTGGGGATCC 540

Asa^TscsccT TccrrorrcT -rAccxTrrccT ccctcagcct gcaacctcta tcctggaacc 600

TCTCCTCCCT T'ICTCCCCAA CTATCCATCT GTTGTCTGCT CCTCTGCAAA GGCCAGCCAG 660
CTTGGGAGCA GCAGAGAAAT AAACAGCATT TCTGATGCCA AAAAAAAAAA AAAAAAAACC 720
GCGGCCQAAA GCTTATTNCC CTTTAAGTAA GOGGrTTAATT TTTAGCTrGG GCACTNGGCC 780

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 588 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
TTCCSiJ^A A3XX3CTCAC TATAGGA^JflTT GCCC?rCGCCA TGACCCXXX3G TAACCAGCGT 60 
GAGdOSCCC GCCAGAAGAA TATGAAAAAG CAGAGCXSACT OGGTTAAGGG AAAGCQCCGA 120 
GAT^==OGGGC TTTCTGCTCC C3CCCGCAAG CAGAGGGACT OSGAGATCAT GCAGCAGAAG 180
C^JSAAAAAGG CAAACGAGAA GAAGGAQGAA CCCAAGTAGC TTTGTGGCTT CGTGTCCAAC 240 
CCXTTCCCC TTCGCCTGTG T9CCTGGAGC CAGTCCCACC ACGCTCGCGT 

AGTCCTCACA GGTCCCAGCA CCGATGGCAT TCCCTTTGCC CTCyVSTCTGC AGCGGGTCCC
rnTOTGCTT CCITCCCCTC AGGrTAGCCTC TCTCCCCCTG GGCCACTCCC GQGQGTGAGG 
GQGTTACCCC TTCCC7CTGT TTTTTATTCC TGTQGQGCTC ACCCCAAAGT ATTAAAAGTA 
GCTTTCTAAT TCCAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA AAAANNCGGG GGGGGGCXXX: CCCCCCXX: 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 774 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

TTTAAAGATG AAGAAATGAC AAGGGAGGGA GATGAGATGG AAAGGTGTTT GGAAGAGATA 

AGGGGTCTBIA GAAAGAAATT TAGGGCTCTG CATTCTAACC ATAGGCATTC TCGGGACCGT 

CCTTATCCCA TTTAATTAAT TTCTCTGACA ATTCAATTAT TrTCTGTTAT TAATGTTGCC 

ACTGCTTTCT GTTTGTCTGC ACTTTCTTGA TAAATATTTG CTATCOmT ACTCCAGTCA 

TTCGATCTTG CTCAGATTTA CATATGACTC TTGTCAACAT CTCATCTTTT GACCCAATCT 

TATTCATTTA ATAAGAGGTC TCATTCATTT GCATGGAAAA ATGCTCATTG TATATTGCAA 

AGTGAAAATA ACGAGTTGCA AAACAGTGTA TACATATATG TGrTGTATATA TGTACACTTT 

ATrTCTACAT TTCTATGTGA CATAATGCAA AGGAAAGTGT CTGATTTTAT TATACACCAA 

AQGTTAACAG TGAATCTCTG TGTGATCTCT TTTTTTTTCT TTTTGCCTAT CTGCATCTTC 

TCACTIGCCA AAAAATGAAT ATATGTTTAT GTGTGTATAT TACTTGTGTC ACAAAAAACC 

CTAAAGTAGA CAGTAAAAGA ACTTGTCAAT CGCCTTTGGA AGGCAATGAA ACACTTAATA 

AACTCTCAAT AACAGAAGCG TAAAAATGAA ATGTAAACCT CCAATTACCT CTGGATCTCT 

TAGCCAGAGT AATAAACTGG TAATTATTAC AGATAAAAAA AAAAAAAAAA AANA 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1866 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

ACCCACGCCyr CCQGTCCTCT TCTTCAGCAC ATGCCAAAGC TGTTCCrCAC GGCCTGTGAG 60 

ACAAGAGCAT dTGGATGTA GGACAATGGA AGAGTTAGAT GCCTTATTGG AGGAACTQGA 120 

ACGCTCCACC CTTCAGGACA GTGATGAATA TTCCAACCCA GCTCCTCTrC CCCTGGATCA 180 

GCATICCAGA AAGGAGACTA ACCTTGATGA GACTTOGGAG ATCGTrPCTA TTCAGGATAA 240

CACAAGTCCC TTCCCGGCGC AOTOGTGTAT ACTACCAATA TCCAGGAGCT CAATGTCTAC 300 

ACJTGAAGCCC AAGAGCCAAA GGAATCACCA CCACCTTCTA AAACCTTCAGC AGCTGCTCAG 360 

15 TTGGftTCAGC TCATCGCTCA CCTGACTGftG ATGCAGGCCA AGGTTGCAGT GAGAGCAGAT 420 

GCTGGCAAGA AGCACTTACC AGACAAGCAG GATCACAAGG CCTCCCTGGA CTCAATC3CTT 480 

GGGGGTCTSG AGCAGGAATT GCAGGACCTT GGCATTGCCA CAGTGCCCAA GGGCCATTCT 540 

GCATCCIGCC AGAAACCGAT TGCTGQGAAG GTGATCCATG CTCTAGGGCA ATCATGGCAT 600 

CCTGAGCATT TTGTCTGTAC TCATTGCAAA GAAGAGATTG GCTCCAGTCC CTTCTTTGAG 660 

25 CGGAGTGGCT TGOJCTACTG CCCCAAOGAC TACCACCAAC TrTTTTCTCC ACGCTCTCCT 720 

TACTGCGCIG CTCCCATCCT GGATAAAGTG CTGACAGCAA TGAACCAGAC CTGGCACCCA 780 

GAGCACTTCT TCK3CTCTCA CTGCGGAGAG GrrGTTTGGrPG CAGAAGGCTT TCATGAGAfiG 840 

GACAAGAAGC CATATTGCCG AAAGGATTTC TTAGOCATGT TCTCACCCAA GnCTGGTGGC 900 

TCCAATCGCC CAGTGrrTGGA AAACTACCTT TCAGCCATGG ACACTGTCTG GCACCCftGAG 960 

35 aXSCTITCTTT GTGQGGACTG CTTCACCAGT TTTTCTACTG GCTCCTTCTT TGAACTQGAT 1020 

QGACCTCCAT TCrGTCAGCT CCATTACCAT CACCGCCGGG GAACGCTCTG CCATGGGTGT 1080 

GGGCAGCCCA TCACTGGCCG TTGTATCAGT GCCATGGGGT ACAAGTTCCA TCCTGflGCAC 1140 

ITIt^TCIXJrG CmCTQCCT GACACAGTTG TCGAAGGGCA TTTTCAGGGA GCAGAATGAC 1200 

AAGACCTATT GrTCAACCTTC CTTCAATAAG CTCTTCCCAC TGTAATGCCA ACTGATCCAT 1260 

45 AGCCrCTTCA GfiTTCCTTAT AAAATTTAAA CCAAGAGRGG AGAGGAAAGG GTAAATTTTC 1320 

TGTTACTGAC CTTCrrGCTTA ATAGTCTTAT AGAAAAAGGA AAGGTGftTCA GCAAAIAAAG 1380 

GftACrrCTAG ACTTTACATG ACTAGGCTGA TAATCTTATT TTTTAGGCTT CTATACAGTT 1440 

AATTCrTATAA AITCPCrrTC TCCCTCTCTr CTCCAATCAA GCACTTGGAG TTAGATCTAG 1500 

GTCCTTCTAT CTCGTCCCTC TACAGATGTA TTTTCCACTT GCATAATTCA TGCCAACACT 1560 

55 GCJTTrrcrrA GGTTTCTCCA TTTTCACCTC TAGTGATGGC CCTACTCATA TCTTCTCTAA 1620 
TTTGGTCCTG ATACTTGriTT CTTTTCACGT TTTCCCATTT CCCTGTOGCT CACTGTCTTA 
CAATCACIGC TGTGGAATCA TGATACCACT TTTAGCTCTT TGCATCTTCC TTCAGrTGTAT 1740
lTi " iVrmT CAAGAGGAAG TAGATTTTAA CTGGACAACT TTGAGTACTG ACATCAOTGA 1800

TAAATAAACT GGCTTOTGGT TTCAATAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1860 

AAAAAA 1866

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1152 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

20 CTCAAGGATG TAAAGGCTCT GCAGATTTCG GGAGGCCICT CTCCCAGCAC CTGATGGGAC 60 

ACTTTTTGCC CCACTGTAAA TTCTC5GGTGT ATCCTCCACT CTATGCTGrTC ACCCCAAGGG 120 

CAAGCACTGC atx:tgcttag TGAAGGAITT ATTGTTCGGA AGATACATTT TCCCCTTKAG 180 

CAGAGAiGTCG CC3TATCCPGG CAGTCTTCGG TGAGCCAGTT GTACCAGGAT TATGAAATGC 240 

AGATGTITAC TGrrcrCATTG TTGCTGTCAT TGCEACTGAG GAGTACTGAC CAGAATCATC 300 

30 TGCAACTYTT AGTTGGCAGA GAGGACCACT ATGGCGGGTA GCTCrrrTCT TTCCTCCCAT 360 

TGTOGQGATG ATTCCAGGCC AAAGATGATG GARAAGTATG GAAATCATCT GAAAGGTTGA 420 

AQCTTOGCAC GTCAAGCCAT TCATGACTTT GTAACGCAGT TTTGCTGAAG GCCAGTrCTG 480 

CCCTCGGAGG GACGGAGGTG AATCCTCCTG AGTACCICTG GTTTTCTTAC TTCCTGCTCA 540 

AnTACCTAA GTGCCTGTTG TTTGCTTGCT GTGGAGGCTT TCTGGTATTT CATTTCAGGT 600 

40 GCAGAaXSCCT TCACTTTCCC ACCRAAAAAA CCCCMACCAA ACCTAAGACC TTACIGCAAC 660 

TAAGmWCC AAGTACTTTT TAACCCAATG OGATGAACAG CCTGTGGTCT GCTCAGATCA 720 



CCCTCAGTCC GTGTGaGAAG GCMIWGGCTT TGCCAGGAAA TCCAGGAAGG CAGGGCOGGG 780

CrcrrcTTOGA AGCTGGCTTA GCTQGTGGGG CAGCCTTATT TCAATTAAAA GGGCATTGAC 840 

TGGGAGCAGC AGTCCTGGAG TTTGTTGCAT TTCCTATTGC CCTCAAAATG AGAAACCAGG 900 

50 AAAATAGCAG ATTGGAGCCT TOGAGAAGGC AGTAAATGGC TGTmTATT GACAAAAGGA 960 

AAACATTTTA CTGCCA3CTC ACTGATGGCA TCTCACTGAC TTAAAATGAA GGCAlKSTTGr 1020 

ACTAAAAAAA AAAGTCTACA TTTTTCCACC QCCACGTTCT TATATCCTGT TTGTCAGCCA 1080 

CTGCTCANAA GGGCATGTTG TCTTGCGGAN TANAGGCGCT CTCCTTCCCT CGrTTTTCCCr 1140 

ATAGGTTGGG TG 1152
(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2483 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

AGCAGGCGGT GCGCTGGGGG CGGGAGCAGC GCGKAGCCCG GCTCGGCCAC ACCGATCGCC 60 

15 CGCCGCCATC GGCrcCTCGC AAAGCGTCGA GATCCCGGGC GGGGGCACCG AGGGCTACCA 120 

CGTTCTGCGG GTACAAGAAA ATTCCCCAGG ACACAGAGCT GGTTTGGAGC CTTTCTTTGA 180 

TTTTATTGTT TCTATTAATG GTTCAAGATT AAATAAAGAC AATGACACTC TTAAGGATCT 240 

20 

GCTGAAASCA AACGTTGAAA AGCCTGTAAA GATQCTTATC TATAGCAGCA AAACATTGGA 300 

ACTGCGAGAG ACCTCAGTCA CACCAAGTAA CCTGTGGGGC GGCCAGGGCT TATTGGGAGT 360 

25 GAGCATTCC?r TICTCCAGCT TTGATGGQGC AAATGAAAAT GTTTQQCATG TGCTGGAGGT 420 

GGAATCAAAT TCTCCTGCAG CACTGGCAGG TCTTAGACCA CACAGTGATT ATATAATTGG 480 

AGCAGATACA CTCATGAATG AGTCTGAAGA TCTATTCAGC CTTATCGAAA CACATGAAQC 540 

30 

AAAACCATTG AAACTGTATG TGTACAACAC AGACACTGAT AACTGTCGAG AAGTGATTAT 600 

TACACCAAAT TCTGCATQGG GTQGAGAAGG CAGCCTAGGA TGTGGCATTG GATATGGTTA 660 

35 TTK3CATCGA ATAOCTACAC GCCCATTTGA GGAAGGAAAG AAAAITrCTC TTCCAGGACA 720 

AATOGCTCGT ACACCTATTA CACCTCTTAA AGATGGGTTT ACAGAGGTCC AGCTGTCCTC 780 

ACTTAATCCC CCGTCTTTGT CACCACCAGG AACTACAGGA ATTGAACAGA GTCTGACTGG 840 

40 

ACTTTCTATT AGCTCAACTC CACCAGCTGT CAGTAGTGTT CTCAGTACAG CSrPGTACCAAC 900 

AGTACCGnTA TTGCCACCAC AAGTAAACCA GTCCCTCACT TCTGTGCCAC CAATGAATCC 960 

45 AGCTACTACA TTACCAGGTC TGATGCCTTT ACCAQCAGGA CTGCCCAACC TCCCCAACCT 1020 

CAACCTCAAC CTCCCAGCAC CACACATCAT GCCAGGGGTT GGCTTACCAG AACTTGTAAA 1080 

CCCAGGTCTG CCACCTCTTC CTTCCATC3CC TCCCCGAAAC TTACCrGGCA TTGCACCTCT 1140 

50 

CCCCCTOCCA TCCGAGTTCC TCCCGTCATT CCCdTGGTT CCAGAGAGCT CTTCTGCAGC 1200 

AAGCTCAGGA GAGCTOCTGT CTTCCCTCCC GCCCACCAGC AACGCACCCT CTGACCCTGC 1260 

55 CACAACTACT GCAAAGGCAG ACGCTGCCTC CTCACTCACT GTGGATGTGA OGCCCCCCAC 1320 

TGCCAAGGCC CCCACCACCG TTGAGGACAG AGTCGGCGAC TCCACCCCAG TCAGCGAGAA 1380 

GCCTGTTTCT GCGGCTGTGG ATGCCAATGC TTCTGAGTCA CCTTAACTTT GAACCATTCT 1440 

60 
TTCGAATTGG CGTGGTATAT TTAACCACGG GAGCGTGTCT GGAAACC3CAA ACTATCATTA 1500
ATTTCATACT AC?rrPGTACC GTATCTGTAG GCATCCTGTA AATAATTCCA AGGGGAAAflC
TAAACX5AGGA CX3TGGGTTGT ATCCTGCCAG GTTGAGTGGG GCTCACACQC TAGGGTGAGA
TOTCAGAAAG CX3CTTCTATT TTAAACAACC AAAAAGAATT GTAAGGGTGG CTTGCTGCCA
GGCTTGCACT GCTGTTCCTG GGQGrTGTGCA TCTTCGGGAA AGGPGGTGGC GGQQCGTCCA

CTAGcynrcc ictcccctgc tgctccitcc cjtaagaaaat gaaatattct atgcctaata

CTCACACGCA ACATTTCTTG TACTTTGrTAA GTCGTTTGCG AGAATGCAGA CCACCTCACT
AAACTCTAAA CGGTAAAGAG ATTTTTACTT TTGGTCTCCG TGAGTCGCAT CTCTACTAAG
GTTTACACAG GAATTCCACC TGAAGACTTG TGrTTAAAGrrT CTACAGOGCG CACTGTTAAC
TCAAOJrCTT TTTCTTCAGC CTATACGCX3G ATCCTTGrTTT TGAGCTCTCA GAATCACTCA
GACAACATTT TGTAACTGCT GCICTTGCTT TCTACATACA CCTTATAAAG TGACATTTCA
AAAGAAATAA GGrTQCCACAG TTTEAAACCA GAAGGTTGGCA CTCTOrGGCT CCTTGTAGTA
TTATAGCEAT ACTOGGAAAG CATAGATACA GCAATAAAGT ACAGTAATTT TACTTTrTTT
CTTGTOTTAC ATCTAAATTA CAACCTTTAA TTGCCAOGTG TGCACTTACT ACTCTCCAGT
ATOTCTTATT ACTCTCCAGT ATGnCAOGCA TCrTTAACTT TTCACGTCCT ATGTITGCIT
TCTCCCATTT TTAAGAGATG GTAAGrTTAAC TGGAATTGAT TTACTGAATG AAATTAAATG
CAGATATCCC TGrTTTTTGAA ATAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
AAAAAAAAAA AAAAAAAAAA AAA
(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 536 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

50 GAGAAATGGA GCITrGTTAG ATAAAAATTT TTTCAACGCA AACAGTCATT TTCCAGTGAA 60 

AGGAGAGCGT ATCCGCOGTA GGATGGACTT AGATCGTGTA AAAGCTGAGG CCACCGAGGA 120 

TATAACCTCC GQGGrPCCTET GCCTCCTTTT CCTTAGACTC CCTCCAAACT OGTGTATCTT 180 

55 

TCCTTCAGCA GTACTQGGCT CCACGCGAAC CTAGTCCTTT GTCTTTACCC TATTACCTTT 240 

CATAACATCC TAGTPGAAAA GTARTTATTC AACCGCGTTT GAAAATGAGA ACAGGTTCAC 300 

60 AGARGCTAGG TTACTTGCGA AGGTCGTTCA ATTAGTAACC AGTAACGCCA GGACTGCCAG 360 
TTTCITOCTT CX35AATTCTC ATGGTAGCIT TCACCARGCT CCCCGTCMAA TGCTAACGTC 420
AACTACTCAA CTAGATTAGC AAAAAGGTCT TTTAACAGAA ITCCTGCTnT TCAGAGAGAG 480 
TITCITrCAT GAAGCGCCCC ATTTCTACAG AGGAAAATAA ACTCCAAGCA GCCAGT 536 



(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 865 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

ccACGCCTCc QGCcmcrr GGCCAGAGGC GCCGGTTGGA CTCACGGGCG GGGCATGATG 60 

GGTAACAGGA CCGGTQGQGT COCCAGGAAG TCCTAGAGGG GGTCGGGGTT TGGGrTGGACA 120 

25 AGCTTTCCTC GTCCTCTCCC GACAGAGCTG ACGTGTCCTG QGTTCCACCG GGAGCGGGCA 180 

TTTCCACCGG ACQGGAGGGT TOGQGGTGTC CGGGGCTGGG GAATACC3TAG GGGTTGCCGC 240 

GCGGTCTOGG GAGTTGGGGC GrGTOGCTGC AGTCCCGGGA GrTTCTTGGAG GGGGTrCGGCC 300 

CACCGAGCTT CCGGACCQGC TGATCTGCCC CTAGCTTGCC QGANGGARGG CGGAGCTGAC 360 

TCTCCGTCCC TTCrcCCATC CCCTCCAGTG GTGGGTACGG GCACCTOGCT GGCGCTCTCC 420 

35 TCCCTCCTGT CCCTGCTGCT CTTTGCTGGG ATGCAGATGT ACAGCCGTCA GCTQGCCTCC 480 

ACCGAGTGGC TCACCATCCA GGG0GC3CCTG CTTGGTTCGG GTCTCTTCGT GTTCTCGCTC 540 

ACrcCCTTCA ATAATCTQGA GAATCTTGTC TTIGGCAAAG GATTCCAAGC AAAGATCTTC 600 

40 

CCTCAGATTC TCCTGTGCCT CCTGTTGGCT CTdTTGCAT CTGGCCTCAT CCACCGAGTC 660 

TGTGTCACCA CCTGCTTCAT CTTCTCCATG GTrQGTCTGT ACTACATCAA CAAGATCTCC 720 

45 TCCACCCTCT ACCAGGCAGC AGCTCCAGTC CTCACACCAG CCAAGGTCAC AGGCAAGAGC 780 

AAGAAGAGAA ACTGACCCTG AATGTTCAAT AAAGTTGATr CTTTGTAAAA AAAAAAAAAA 840 

AAAAAAAAAA AAAAAAAAAA AAAAA 865



(2) INFORMATION FOR SEQ ID NO: 71: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 932 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

TCATCATATA CAAAGTTTTT CGTCACACTG CAGGGTTGAA ACCAGAAGTT AGTTGCTTTG 60
AGAACATAAG GTCTTGTGCA AGAGGAGCCC TCGCTCTTCT GTTCCTTCTC GGCACCACCT 120
GGATCTTTGG GGTTCTCCAT GrTCTGCACG CATCAGTGGT TACAGCTTAC CTCTTCACAG 180
TCAGCAATGC TTTCCAGGGG ATGTTCATTT TTTTATTCCT GTCTGrTTTTA TCTAGAAAGA 240
TTCAAGAAGA ATATTACAGA TTGrTTCAAAA ATGTCCCCTG TTGTTTTGGA TGTTTAAGGT 300
AAACATAGAG AATGGTGGAT AATTACAACT GCACAAAAAT AAAAATTCCA AGCTGTGGAT 360
GACCAATGTA TAAAAATGAC TCATCAAATT ATCCAATTAT TAACTACTAG ACAAAAAGTA 420
TTTTAAATCA C?mTTCTGT TTATGCTATA OGAACTGTAG ATAATAAGGT AAAATTATGT 480
ATCATATAGA TATACTATGT TTTTCTATGrT GAAATAGTTC TGTCAAAAAT AGTATTGCAG 540
ATATTTGGAA AGTAATTGGT TTCTCAGGAG TGATATCACT GCACCCAAGG AAAGATTTTC 600
TTTCTAACAC GAGAAGTATA TGAATOTCCT GAAGGAAACC ACTGGCTTGA TATTTCTGTG 660
ACTCGTGrrTG CCTTTGAAAC TAGTCCCCTA CCACCTCGGT AATGAGCTCC ATTACAGAAA 720
GTGGAACATA AGAGAATCAA GGGGCAGAAT AICAAACAGT GAAAAGGGAA TGATAAGATG 780
TATTTTGAAT GAACTGrTTTT TTCTGTAGAC TAGCTGAGAA ATTGrrGACA TAAAATAAAG 840
AATTGAAGAA ACACATTTTA CCATTTAAAA AAAAAAAAAA ACTOGAGQQG GGCOOGGTAC 900
CCAAATCQCC GCATAGrTGAT CCSTTAAACAAT CT 932



(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 996 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

CGCCTOGCAC CATGAGGAOG CCTGGGCCTC TGCCTGTGCT GCTGCTGCTC CPGGOQGGAG 60 

50 

CCCCCGCCGC GCGGCCCACT CCCOCGACCT GCTACTCCCG CATGCGGGCC CTGAGCCAGG 120 
AGATCACCCG CGftCTTCAAC CTCCTGCAGG TCTCGGAQCC CTOGGAGCCA TGTGTGAGAT 180 

ACCrcCCCAG gctotacctg GACATACACA attactgtgt gctqgacaag ctgcgggact 240

TTGIGGCCTC GCCCCCGTOT TGGAAAGTGG CCCAGGEAGA TTCCTTGAAG GACAAAGCAC 300



GGAAGCTGTA CACCATCATG AACTCGTTCT GCAGGAGAGA TTTGGTATTC CTGTTGGATG 360 
ACTCCAATGC CTTGGAATAC CCAATCCCAG TCACTACGGT CCTGCCftGAT CGTCAGCGCT 420

AftGGGAACTG AGACCAGAGA AftGAACCCAA GAGAACTAAA GTTATGrTCAG CTACCCAGAC 480 

TTAATraGGCC AGAGCCATGA CCCTCACAQG TCTTGTGTTA GrTCTATCTG AAACTGTTAT 540 

CTATCnCrCT ACCTTCTCGA AAACAGGGCT GGTATTCCTA CCCNGGAACC TCCTTTGAGC 600 
ATAGAGTTAG CAACCATOCT TCTCATTCCC TTGACTCATG TCTTGCCAGG ATGCrTTAGAT 660

ACACAGCATC TTCATTTOGT CACCTAAAAA GAAGAAAAGG ACTAACAAGC TTCACTTTTA 720 

TGAACAACTA TTTTGAGAAC ATGCACAATA GTATGTriTT ATTACTGGTT TAATGGAGTA 780 

15 ATOCTACTTT TATTCTTTCT TGATAGAAAC CTGCTTACAT TTAACCAAGC TTCTATTATG 840 

CCrnTTCTA ACACAGACTT TCTICACTGT CTTTCATTTA AAAAGAAATT AATGCTCTTA 900 

AGATATATAT TTTAyCTAGT GCTQACAGGA CCCACTCTTT CATTGAAAGG TGATGAAAAT 960 

CAAATAAAGA ATCTCTTCAC ATGARAAAAA AAAAAA 996 
(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 785 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

QGCACGAQGG GCTTTGCGTA CACAATAGCT GCTAGGAGTA CCCAAAGCCT GARTACARCC 60 

TCCTCCHCTC ATGGCCAOGT GTGAGCAGGC CAGCGTCAMA CGGCTCGCTG TGACCCGrTCC 120 

40 CGRAGACTCA AA3GQGCCTC GGTCTTCTCC TKGTCCTGTG ATWAAAGTCC TCTdTCAAA 180 

CnXSGAGAGCA AAGGCACACA GAGGTGCGCG CTCACAAGAA TTCCTCCCGG TGACTGGGTA 240 

ATCAATGXTA CTGCrGTTTC CTTTGCAGGA AAGACCACAG CAAGATTCTT TCATTCGTCT 300 

CCTCCTAGCC TOGGQGACCA GGCTCGAACT GACCCTGGAC ATCAAAGGAG GGATTATGTG 360 

GCTCCTAAAG CCATCGGCCC ACAGCCCTGT TCACRTCTTG GrTGCTTCTCT TTCCCAGAGG 420 

CTOCnCCCAG CCAGGCACAC ACAAAAGGCA GATTCTCGTA AACSCAGCCT CCCTCCCTGG 480

AGGCroCCTC CTGCCCTGGA TCEGGAGTCG AGCTGCTCTG AGATTTTGAG TTCTTCTGCA 540

GAGATGATTA AATATATCCA AGAGACATTG GAAAACCTGC TGAACATTTT ACATTGGTCT 600 

GCTCAGCACA TOGCTGGATG CGGATATTTC TATAATTCCA GAAAGTCACA CAGCICCTCT 660 

GTATGAGACC AGTOGGOGCC ATTTAAAAGA ACAGGATGAG AATCTAAGAT ATATTATTAA 720 

TAAAICTAAT GGATTTTTTT TTTGrTAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 780

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1069 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

TCCTCACCAT TCCCCTAGGN CAGGTCCCTG CAGGTCCCAC ACTTCTCCCA GGTCCCTAAA 60 

CnOGCTrCGG TCCTTTCCCT GGAGTAGCTG GNTCCTCCAG TCC5AGGTCCC TGTTCAGTCG 120 

GrrCTTAGGC TCCTCCACAT GAAGGTGnXTT GCCTCrrGGTG TCJTGGGCTGC TCTAGGAGCA 180

GATACAGGCT GGTATAGAGG ATGCAGAAAG C3TAGGGCAGT ATGTTTAAGr CCAGACTTGG 240 

CACATOGCTA GGGATACTGC TCACTAGCTG TGGAGGTCCT CAGGAGTGGA GAGAATGAGT 300

AGGAGGGCAG AAGCTTCCAT TTTTGTCCTT CCTAAGACCC TCTTTATTTGrr GTTATTTCCT 360 

QCCTTTCCGA GnCCTCCAGT GGGCTGCCCT GTACCCTGAA CCTCATGAGC CTCTAAQGGA 420 

AAGGAGGAAC AATTAGGACG TGGCAATGAG ACCTGGCAGG GCAGARTACA AGCCCAGCAC 480 

CAGTCTCCCA GCCTTACTOG GTCCTTACCC TGGGCCAAAC AGGGAGGGCT GATACCTCCT 540 

'roCTCTTCCT AGATGCCCAC CTCCTACAAT CTCAGCCCAC AAGTCCTCTC CACCCTAGGG 600

GGCTTOCTGC ATGGCAATAA CTCaTAATCT GATTTGGAGG TTTGCCCTTT ACAGGGGCAG 660 

ATTTTCTOCT CAGTTCAACA ATGAAATGAA GAGGAACTCC CTCnTCTAC AGCTCACTTC 720 

TATCAGfiGGC CCAGGTGCCT CAGAGCCACA TTGAGTTGCT TTTTCTGQGA TGAGGAAGTA 780 

GGGITAAACT CCCCAGnTTC CTGAGGGAGG CTCCTGACAG GTGCCCTTTG TCAGACOCTA 840 

CCACAGCCTG GATAGGCAGC CACATTQGTC CTCGCCCTTG CTCGCaiACTC CGTGGTGGTC 900

crocccncT CCCTOCATGC CTGTGGGTCT QCTCTQGTGT GTCAAGGTCG GTGGGTTAAC 960 

TGICTGOCTA CTGAACCTGG CAAATAAACA TCAOCCTGCA AAGCCAAAAA AAAAAAAAAA 1020 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 1069 



(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 831 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
GGACATTAGA TCACTGrTGGA CCTAAAACAA ACAAACAACT ATAAGGAAAA TGGCATTAGA 
AATGGTCTGG GGATCAGTTT ATCACTGCAG TTGTTACATC ACCCCATGGT CTAAAATACA 
GAGCTTTAGT CTGTCTCTGT TTCAGTTCAT TTTACAGGAG GTGAACATCA CACTTCCAGA 
AAACTCTGTC TGGTATGAAA GGTATAAATT TGATATTCCT GTCTITCACT TGAATQGCCA 
GTTTCTCATG ATGCATCGAG TAAACACCTC AAAACTTGAA AAACAGCTCC TGAAACTTGA 
GCAGCAAAGT ACTGGARGCT GACTGATGCC CTCATGATTT TCCACCCTCT CTTCCCATAA 
AGCATCTTCC TAAGGAAATG AMCATGGCCT GATACTCATT TTGTCACTTG TACAGAGCCC 
TAAGGATGTT CTGAATTCAG TGGTGCCAAA TAAATGTTGA CATTCCCCTT TTGGrrTGATG 
GAAGTATCAG TGTGGGAACT GTTTGCTTAA TGGCATTTTA TAAAATAAKA AKAKCATATT 
AGCAGGGAGG GAGATGATGG AGGGAGQGAG AAGTCCATrX GTCTTATTTA TCCTTTTTGT 
ATTAATAGAG AAGCACTTCA CAGTCACTGG CAATGCCATT TATAGGAAGA AGGTTCTGCA 
TTCCTGCTQC TCCOGGAGGG CTTAACmT TAATGAAAGA ATAAATGCTC TTCCACICAG 
TAGATAAAGT GAAATGTGAA TTGTTAATAA CTGTGCACGG TCAATAAAGC GATGTTTTAA 
GGAATACAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTCG A 

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 590 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
TATATATAGA OKSTTAATAG TOGOXSANTGN TGTGNACGAA CATTAACGGA AGTAGCATGT 
AGCCAGTCGA ATAACNTATA AGGACAAAGT GGACJTCCACG CGTGCGGCCG TCTAGACTAG 
TGGATCCCCC GGCTGCAGGA TTCGGCACGA GCTGCCAGGT GAGGAGCAGA GAGACTGTTC 
CCTTQGGTGG AGAGGTGrTGG GCATGAGAGC CACCCATTGC CAAGCAGCAA GAATGTTCGT 
QCTTTTTTCC CTTCCAAAAT ATGCAGQGCT CAGGCTCCCA ATTCCGGGCC TGTCTGCTTT 
GCTTGTGrrTT CTCCTGTCCC TGTTCTCCCG GAGGGCCCAG GTGGAACTCA CGACAGGGAG 
GGAGACGCTT CCCAAAAACC TGCAGGGCTA TTTCCCAGAA TTTGGTTTTC AAGTACAAAA 
CTTTTTGrrCC TGTAAGATAT ATGCAGCCTC ACAGAAGCAG CCTCTGCCTC CACTTTACCA 480
GCTAOGTrrr tajtcttaagc acatggggct cccttagaac ttactccact gatttaaaaa 540
AAAAAAAAAA AAACTCGAGG GGGGGCCCGG TACCXATTCG CCCTAAAAGT

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1274 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

GAGCCACCAC ACCTGGCCTG GAAGGAACCT CTTAAAATCA GTTTACC3TCT TGTATTTTGT 60 

TCTGTGATQG AGGACACTGG AGAGAGTTGC TATTCCAGTC AATCATGTCG AGTCACTGGA 120 

CTCTGAAAAT CCTATTGGTT CCTTTATTTT ATTTGAGTTT AGAGTTCCCT TCTGGGTTTG 180 

TATTATGTCT GGCAAATGAC CTGGGTTATC ACITTTCCTC CAGGGTTAGA TCATAGATCT 240 

TGGAAACTCC TTAGAGAGCA TTTTGCTCCT ACCAAGGATC AGATACTQGA GCCOCACATA 300 

ATAGATTTCA TTTCACTCTA GCCTACATAG AGCTTTCTGT TGCTGTCTCT TGCCATGCAC 360 

TTGTGCGGTG ATTACACACT TGACACTTACC AGGAGACAAA TGACTTACAG ATCCCCCGAC 420 

ATGCCTCTTC CCCTTGGCAA GCTCAGTTGC CCTGATAGTA GCATGTTTCT GTTTCTC3ATG 480 

TACCTTTTTT CTCTTCTTCT TTGCATCAGC CAATPCCCAG AATTTCCCCA GGCAATTTGT 540 

AGAGGACCTT TTTGGGGTTCC TATATGAGCC ATGTCCTCAA AGCTTTTAAA CCTCCTTQCT 600 

CTCCTACAAT ATTCAGTACA TGACCACTGT CATCCTAGAA GGCITCTGAA AAGAGGGGCA 660 

AGAGCCflCTC TGCGCCflCAA AGGTTGGGGT CCATCTTCTC TCCGAGGTTG TGAAAGnTTT 720 

CAAATrCTAC TAATAGGSTG GGGCCCTGAC TrGGCTGTGG OCTTTGGGAG GGGTAAGCTG 780 

CTTTCTAGAT CTCTCCCAGT GAGGCATQGA GGTGTTTCTG AATTTTGTCT ACCTCflCAGG 840 

GATGTTGTGA GGCTTGAAAA GGTCAAAAAA TGATGGCCCC TTGAGCTCTT TCTAAGAAAG 900 

GTAGATGAAA TATCGGATGT AATCTGAAAA AAAGATAAAA TGriGACTTCC CCTGCTCICT 960 

GCAGCAGTCG GGCTGGATGC TCTGTGGCCT TTCTTGGGTC CTCATQCCAC CCCACAGCTC 1020 

CCAGGAACCT TGAAGCCAAT CTGGGGGACT TTCAGATGTT TGACAAAGAG GTACCAGGCA 1080 

AACTTCCTGC TACACATGCC CTGAATGAAT TGCTAAATTT CAAAGGAAAT GGACCCTGCT 1140 

TTTAAQGATG TACAAAAGTA TGTCTGCATC GATGTCTGTA CTGTAAATTT CTAATTTATC 1200 

ACTGTACAAA GAAAACCCCT TQCTATTTAA OnTGTATTA AAGGAAAATA AAGTTTTGTT 1260 
TGTTAAAAAA AAAA

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1133 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

AGGATnrrc cttgttcaac caaaatctga gcattctttc tatgttgaaa acactgaaaa 

ACTAATTTWA GTTAATGAAC TAGAAAGAAT ATTGATTTIW AAGAAACAGA AAAATACTAC 
TTATTTTCCT TCTCAAATAA CGrrTTCTTTC AAAAACTTCT GGCTGAAGTA TAACATGCTG 
GTAGTTAACA TAAATCTTGT CTTTCTCTTG TTCTTTATCT TTCTTTGrrTA TTTAGATGCT 
TCTATAAATG TCTnTGTTT TTATTAAGTG CCTAATTGAC AGAGCTTAAT TTGAAGAAGT 
GCCCTAATTT ATTGACCACT TAAGAATTGC CTTTATTQGG GTATTTTATT TC3TTCCTGCG 
TCTTTTTGAT GTTGTTCAGT CTACTCftTCC CTGrTGAGTAT GTGTGGGQGA CAGCTGATAG 
AAGGGAGGAG AGTGTGTCTA TGCTCAQGAT TGCCCTTTAG CCACTCAGCC AGAGATCCAC 
AGGGAGCAAC AAGGACAGTT TCACATQCTT AGACTTTCTT GGAAGAAACA GTGAGGAGGA 
GTAACJTCGTG AGTAGTGTCA AGCTGGATGT AGAATTGTCC TAAGGCAGTr GACCCCACCT 
TCCAACATGT TTTCACITEA TTTGCCCCTC CCTACATTTG GGTTAGGTTC CATTTGGAIT 
TGCAGCAATA ATGACTTTAT TTCTCTCTTG GTCAGGATTT GGCACATAAA ATCCTTTTAT 
TATAGAACTA GCTATTTTAG TTACATAGTA ATGTAACTAA TGGAGAGATT TATAGAGAAT 
TriXJK?lTrm CTGTCATATA TGTCCATTTT GGAGACAGAT ATGATAGAAC TAGAAATTAA 
GTTGCATTTC TQCAAGTGCC ATTTGAATGA ACTTCAAGTA TCTTCTTAAT TATTAAATTT 
TCTGATGAAG GCATTGrCAAC AAATATATAG TATTATTAAA TCTAATTAAT ATTTGGAAAT 
ATTAATAAAT AGGTATTTTA TTEACTGTEAA AAAGTCAAAC TTCATTATGT AGATAAflTCT 
TATTCTTTTC ATTCTTTCCC CTGTTTACAT CCTTTTTACA AAGCTTAGTC ACCAATTAAA 
GCmCCTAT CAAAAAAAAA AAAAAAAAAA ACTCGAGACT AGTTCTCTCT CCT 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 661 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
GAATTCGGCA CGAGGQGAAA AGGATGCTGA ACGAGAGCAG AAAGCCTCTT TCCTTTGCTT 60 
CACGCCTTTC CAGTCTTTAT TTTAAACTCG GGTTCCCTTT CTGTGGTCGC AGCAACCTTT 120
ACTCCACCTG CACTGCTGCT CCTGGGGGCT CCCCAGGCCT CCCTCTGCCT TTCTACXXrAG 180 
TGGCTGACGG GATGCCTGTC TTQCCTGGAC GCACCACTGC TCHXTGTCC CTCACCTTGG 240 

CTTITCCTOT GCCCTGCTCT GGGGTTGAAG CPGGCCCATG TGTCCCCCGG AGTCATC3GCT 300 
GCTCCTCCTG GGAGGCCTCT GrGrPGCGTCA CX3TCTTCCAC ACCTGGGGGC AGCTGGCGAG 360 
CXCGTGCTCT GTTCCXXTCG GCTGCTTGGC ACAGAGXTGC AGCCTGGGAY TCTCCGTGGA 420
CCCAGACTGG GGATTTTGCX: AGGGGGGCGA TGGGAGGAGC AGGTGCTTTG CCTGC3CGGCT 480 
GTGTCTGCAT TTCTQGACGC CCCAGAGCAC AGAAGTTGCC GGCACTTTGA GGTCTTCCTC 540 

QGCATGTGCC AGATTACATG AGTGACXSGCT GGGAATATCTT TTPCTTTTTT CTAATGGAGG 600 
CGTCnTCAC ATATACTAAA GCTCACCAAA AAGTAAAAAA AAAAAAAAAA AAAAAACTCG 660 
A 661



(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1378 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

ATTQGGTACC GGGCCCCCCC TCGAAGTTTT TlTm TITT TTTTAATGAA AGCTCTCAAA 60

TAAGCGATTT TATTCCTMC CATGATTGCA GACATTTACA AAACCATAAC ATCTGAGTTC 120 

ACCTTAAAAA ATAACTTATA TAAAGCAGTG ATATACACAG CACAAAATAG TTCAGGGAGG 180 

QOGCAOGAGC AACTTGTAAT AATTAAAATG TAAACGTGAA AAAAAGGATG GAATAAAACT 240 

CCCTACTTAT TTCTACTTAA GATGTCATGT GATAATATTT TACAATGTOC TGTGGGTCAA 300 

TGTATGTATG TGTATATGTC TGTATAACAT ACACATATAC AGTACATTCT CTTTCCCACA 360

CATATACATA CACACATAAT TATTTGCAGT TCAGTTTAGG GCAATTCTAA TATGCCACTC 420 

CGTACAGTTG TTTGAATCAC ATTTQGACCC GCTTTCTTCA CAAAAGAGGG GAGAGAGCAG 480 

GAAATAAAAA GGTTGGTTTG GTGTGACTGA GATTCCTrTG TTTAACTGTA CACTGTGATG 540 

AATAATTTTC TTCCGTACTA GTTCKJIGAA GQGCTGACTC ACTGTGGTTT TCATGAGGAG 600 

ACrrOGTAAT GGATCACACG CTCATTGrCA TGCTAGGGGA GTAACTCTCA CTCTGAAAAG 660 

GATTTAAGAA ATTTCCCCCC ATTTCGCCAT CATCCCTTGG AGTGCTCGGT TGATTACTCA 720 

GGCTCATATT ATTCGGAGAA TTCrTOGAAA TACTOTCCAT ATCTCCTGAG CCTAAAGAGC 780 

CATTCArorc ATCTGACTCC ATTCCTCCTA ATCCACXXZAT GGGACCAOCT GACCCAGGRC 840 

CXMTGGAAA ATTAGGTCTG TTA3GTCCAG GAGGTACTGC ATTCATTAAA GTATACATGT 900 

TATCACCAGA GTrGGTTGAA TCTGCTGGAC TAGGCATGAT GGGTOTTCCT GGTOGCCCTC 960 

CACCTCCTGG AQGACCTACA TAATTCCCAG GAGATGCTGA GGAGTATGGT ATTGAATTGG 1020 

CATTTGTTGG GTTTGGCCAA GGTrCTACCAC CACCTGGACC CATGTTCATr CCAGGCATTC 1080 

CAGQGCCACC TAAAGCATTC AGTGGGGGTC TCATTGCACC TCCATAGTTC TGTGGTCCTA 1140 

AGGGCACCAT TCCTCTPQGA GGAGTCftTTC TCTGCATTQG CCCACCCATA TTTGGATGTC 1200 

CTTGTTGTCG AGITGGATCC ATTCCACTGG GGAGTAATGG CTGACTTCCT GGGACACCTC 1260 

CAAGTGCCTG ATTAGGTATC CTCAATGGGG GCCTTGGACC TCCftGGGTAC CGAGGTGACA 1320 

TAAAAGGGTA ATCATGGAAG GCTTTTGCTT CACTTGAGrrG TTCACATCTCT TCACGTCT 1378 

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1440 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

ACTTTGTCCA AATGTGrrCTG TCACATGTAG TCAGCTC3JAG NAATTTAAAA TGAATPGCCA 60 

AGTGAAGAGT CTGTGGATTA ATTGGCCGTT AATTAACAGG CTTTATCAAT GICTCCTCAA 120 

GGGAGAGGCC CAACCCTAAT TAAQGAGCTA AACTTCCTGA GTGAGGGGCT GTGAGGATGG 180 

AQGTGGAGGA GGCATCTQGG GCGGGTGGTG GCCGGGCCAG CAGATGGCGC CTCCCTGGCT 240 

GAGCTGCCCG CACCGCCAGT TCCCTCATTT CCACTCAGGA AGGCAGAGAA GGCAGAGTGA 300 

TCTCCTCAAG GAAGAGCITC CCCAGCCTTC GGGAGCAGCT GGCAGGGCGT CCGGGAATAA 360 

GCCCTACACG CCGCCGCCTG CCTCCAACTC ACTAACCCTG CGCCTCTTGT CTTTCAGATT 420 

CAACGCGTTC AACAGAAGCC ATCCCCAGCC CAGCTTAAAT TATAAAGATA GACAATAWTT 480 

CTGTTCCAAT CTGCGTGGTG CTTCTTTAGT AAATACTGTA CAGATTTTAC CATQGAGAAC 540 
TTTTTTTTTA GTTTTTACCT TTTCTTAATT ACCCTTATTC CGAATGGACG AACACTTTCT 600

ACCACTGCTG ACCATTGTAA AATACCGrTGT ATATAAATCC CATTGAAATA ATGCXXTOGA 660 

ATAGAACATC TCAAATGCTG CTTAATTACA GACTCAGGTC GATTACTTGT ATTTCATGTA 720 

ATGTTCCTCC AAGTTAGACA TCTGGTGCAA GACCAACCGG GAGACCATGG AATTGTCAAA 780 

AGTACAAACT GACAGTGTGT ATATTTAATT TAAAGACTTA TTTAAAAACT CACAAGCTCT 840

CACCTAGACT TTGGAGAGCA GTCTGTTTTC TGTAATGTCT GATACTAGAA ACTAATTTGC 900 

TTATTTTAGT TGTATTCAAG ATTTGAAGAT GTATTTTATA GACAAGTTCT GTTTTTGAAC 960 

TTTGTQGAAC TGTTCCAATC AATCAATTTC CCAGTTATGA TGAGTATTTA CATTATGAAT 1020 

GTATAACCCA GACATGArTT GTAAAGCCGA CAGTATGTTT CTATTACACA ACACTTTTTG 1080 

ATACAGCGTC TCTTGrTCTTC ACTGATACTG GAGTCTCCCT TGTCTGOJNG GTCCXnTTCGA 1140

GTTTCTAGTT ACAGACACAA TCATACTGTC ATTTTATTTT TAATATGGAT ATGCTATCAA 1200 

ACTGTGATAC ACTTATAATT CACTGGTCCT GCATCAGGAG ATGGAGTGGG GAAAACTGTA 1260 

TTTAATACAG TTTGTATCTG AATAATCTGT ATGGrrTTATA CAGTTTGrTGT TGTTCAGAGA 1320 

TGTTTAAAGT TTGATCTTTC5 TTTTTCTAAA GATTAAAAAA GCACTTGCCC CACTGTAAAT 1380 

ATACAGCATG TAAAATTTCT RTAGTATATA AATGGCAGCA AATCACAAAA AAAAAAAAAN 1440



(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1381 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

AGGAATTCGK YACGAGGCCA GCAGTTOCTC CCAGrTTCAGG AGGrTGCTCCT 60 

CTTACCCTOGC CACAGCCCAA TCCTCCCACT GCTGACATCT GGGGAGACTT TACCAAATCT 120 

ACAGGATCAA CTTOCAGCCA GACCCAGCCA GGCACAGGCT GCXTTCCAGTT CTGACCTGAG 180 

CACGGTTTTT CCTCATGTGA CTTCTQGGAA GGCGCTCCCT CATCTGGGCC AAAGGAAGGA 240 

GGACGAAGCC CTCCTCAGCT GGCCTGTOTT OGGGGCATGA ATCTCTOCTC TCCTCCTTGT 300 

CTOGCrcror TGACAAACCG GGCATGrriG GCAGTAAATT GGCACCGTGT CACACTGTTT 360 

CCTGGGATTC AAGTATQCAA CCAGAACACA GGAGAAGAAA AGCTCCAGGA TCCCTGTCCC 420 

CATCTGTCCT CTTGATGTGA GAGAGACTCT GAGACTTCTT CCATCGCAAT GACCICTATT 480 
AAACACAAGC CCCCCAAGCA AAAGAAGAGG TTGAGrTTGC TGCCAGGATT CAGATCAGCC
CTTCCX^iGGG TCTGCAGGTG TCACATGATC ACAGTTCAGC GGGAGGCTTT CCGTACCCAC 
ACTGGCCCTA GCACTTCAGT CCATCTGCCC TCCAGAGGAG GGTTTCTTCC TGATTTTTAG 
CAGGTTTAGA GGCTGCAGCT TGAGCTACAA TCAGGAGGGA AATTGGAAGG ATTAGCAGCT 
TTTAAAAATG TTTAAATATT TTGCTTTGCT AATGTGCTGA TCCGCACTAA CTCATCTTTG 
CAAAAGGAAC TGCTCCCTCG GCGTGCCCCA GCTGGGGCXT CTGAAGGGAT TCCTCACTGT 
GGGCAGCrcC CCTGAGCTTC AGGCAGCAGT GTTCATCTCT GGCCAGTTGT CTGGTTTCCA 
TGTATTCTAG GCCAC3GTAGG CAACACAGAG CCAAGGCGGG TGCTGGAAGC CAGACGGAAC 
AGTGTTGGGG CAGGAAGGTG GATQCTGrTTG TCATGGAGCT GTGGGAGTTG GCACTCTffTC 
TGCTGGTQGC CCTCTCOGCT CACATGnTCA CAGrPGCAGCT CCTQGCAGAC TTGGGTrTTC 
TCTTTGGTGG TTTCTAAAGT GCCTTATCTG CAAACAACTT CTTTTCTCCT TCAGGAACTG 
TGAATGGCTA GAAGAAGGAG CTCAGTAAAC TAGAAGTCCA GGGTTGCTTG GnTTACTGGT 
TTATAAGAAA TCTGAAAGCA CCTCTGACAT TCCTTTTATT AACTCACXTTC TCAGrTTGAAA 
GATTTCTTCT TTGAAAGGTC AAGACCGTGA ACTGAAAAAA GTGTTGQCCT TTTTGCGGGA 
CCAGATTTTT AAGATAAAAT AAATATTTTT ACTTCTGrrCA AAAAAAAAAA AAAAAAATNT 
C 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1706 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 
ACTGCACCAC TQCCCAQGTC TCCCGGCTGG ATGAAGAOGT GGTCCATGAG GAAGCTGGCT 
AGCTCAGACT GGAGACTAGC TTCAGGAAAA AAGACAAGTG GCCTAAGGAA ATCAOGGCCC 
CCAACTATCA TGTGAGGGCT AAAGATGAGA AGTAGATCAC TTAATAAGAC AAAAGCCTGT 
AGGGGGAAAA GAAAGGATGT TTAAAAGGAC AGAATGTTTC CCAAGGTAGA AATGACACTG 
TCAATTTCTC CTTQGAATGG GGGCAGGGAT ACTCGCCTTG TTGCTCCCAC TTGAGTCAGT 
ACTCACCTGC TCCTGGATCT CAGTATCCAC ATCTGAGAGG CAACTCTGGC AGAGTTCACA 
GAAGGCCACC ATTCTGTCCC TCAAACTCGA CAGCTGCTTC TGTGGGCACA GrTGGCTTGaA 
GGGGAAGAAT GAAGACACAG ACTCCTCTGT TCCCATTATC CCATCTAAGA CCCACACTCA 
cx:togggaag CATCTGATTT AGAAATGTGG GTTAGTGTCC AGAGAATGGA AAAATAGACA 540

AGAGTCAAGG CTGGCAGGAT AACCTGTAAC AACAAAGGGT TTGAAAAATG AQGTTTGGGT 600 

TRGGAGAGGG AGAGACAGAT AGCCAGAAAC ACACCAGTGA AGAGGAGAGA AAATGAGTAA 660 

AGGGAGAGCT AATTCCTTTT CXMTGGAAA ATCACrTGATA TTCTGGACAT TCTTCAGAGG 720 

CATCTACACG AAGTAGAAAT GTCACCGCTC CCTAATTTAC TCTACGTCTT CTAGAATCCC 780 

TCAATATTAT CCITCGCTTC CAGGAAATCC AAGAAGACCC TGGAAGTAGA GTCCACCTTC 840 

TAAGAGAGGA ATGTAAGAGG TGACCCCCAC CCACCTGATC TrCCTCGCTT TGTCCACTCC 900 

ACX^CACTGAG ACTTGACACA CCTAGTGGCC ACCTAGAACG TAGGTCCTTA AAATYTAGCC 960 

CCCCP£5CCCC CAACCCATCr CTAGCCTGTC CACTCACCTG GTGAGGAACY TrrCCTGTGT 1020 

CCACAGCYTT CTGCAGGAGT TGGCAACATG QCTCATAGAG CTCCCAGCGA GTCAGGTCAT 1080 

GAGTGCTTTG GGGGAGAAAG GGGAATCTTA TACTQGAAAA GAACAGAGGG AACCAACTCC 1140 

ACAGACACCA GTAAAAACGG GATGGGGAAG AGGAGGAAAG CCACTCACTT GTAGAAGGCA 1200 

GAGAGGOGTT TCAGAGTrGGC TGCCAGATTA TATACCTCAT CCTCATCTAG GAAGGACGAC 1260 

TGAGAAGGAA AGAAGATCCA CAATAGCATT TCCCCCAGAA CTCATCAGTC CACATCCCCC 1320 

GTCTTGCAGC CCCTCCCACC CTTGTTTGGG GTOTCCCATT GTCCAGCCXr AGCTCCTACC 1380 

TGTAACAGCT CTTCAAGCTC CTGCTGGAAR OGGTCAGTCA GCAAATCTAC TAGCTGGCTG 1440 

CGGGCAAAGT CCGCXXX3GCT GAAGAAAGTG AATTCGGGAT TACAGAGCAG GTAAGAGCAT 1500 

GCGCCCCAGC CTCAAGCACC GCTOGCTCTG CATQCTTCAC CACCACCTCC TGGAGTTCCT 1560 

GCAGGAACAG CTCCAGGfTGC TGAGAAGAAA AGGCAGAAGA TGGrTGTGCTG TGGGGATGGG 1620 

AGGAGGACAC TCTTCTGGCG GGAAGTOGAA CGGGGTTAAA AGCATTAAAC TTCAAGGAXA 1680 

AGATGCCEAA RAAAAAAAAA AAAAAA 1706

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 
GAATTOGGCA CGAQCTTGGT AGCCTTAGAA CTGCATGAGC TGCTTTACCA CTGGGAAACA 60
CGAGCACAGC CTAGCTTGAT TTTGTATGTG GTATCAGATC TAAGGTGGAT GGAATTCAGG 120
ACTTCCTCTC TACTCTTlXiA mwriTPA TTTTTAGAAA TGTTTTATTT TGTTTTATTC 180
ATTTATTCAT CTTCAGAGAC ATGGrTCTGGC TCICTTGCCX: AGGATGGAGT GCATGGTGTG 240
ATCATAGGCC ACTGCAGTGT TGAGCTCCCG GGCTCAGGCG ATCCTCCTGC CTCAGCTYCC 300
TTAGTAGCTG GGACTATAOG CACATGCCCT ACCATGCCTG GCTTTGTCTA CTTTTTGAAT 360
GAOCTCYCAA ACTAGAAGGT CTATTAATTT AAAAAATTAA GGATAGCATG CCATAATTAA 420
AAATAATAAC AGTTGGGAAAA GQCACCTrCC AATGATTCAG ACATCAACTT GTGATTTAAA 480
AAAACX3AAAA ATAAATAATA GGAAAAAAAG GGGAAAAAGT TAAATAAAAA TAAAATTAAA 540
AAAAAAAAAA AAAAACTCGA GGGGGGCCCG GTA 573

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 684 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:



CTCTTTGGCT GTCTCTACCT CCTTCATCTG CTGCGCCGAC ATAAGCACCG CCCTGCCCCT 60

AGGCTCCAGC CGTCCCGCAC CAGCCCCCAG GCAOCGAGAG CACGAGCATG GGCACCAAGC 120 

CAGGCCrccC AGGCTGCTCT YCACGTCCCT TATGCCACTA TCAACACCAG CTGCYGCCCA 180 

GCTACTTTCG ACACAGCTCA CCCCCATGGG GGGOCGTCCT GGTGQGCGTC ACTCCCCACC 240 

CACGCTGCAC ACCGGCCCCA GGGOXTGCC GCCTGGGCCT CCACACCCAT CCCTC3CACGT 300 

GGCAGCTTTG TCTCTGTTGA GAATC3GACTC TAOSCTCAGG CAGGGGAGAR GCCTCCTCAC 360

ACTGC?rCCCG GCCTCACTCT TTTCCCTGAC OCTCGGGGGC CCAGGGCCAT GGAAGGACCC 420 

TTAGGAGrrrC GATGAGAGAG ACCATGAGQC CACTGGGCTT TCCCCCTCCC AGGCCTCCTG 480 

GOPGTCATCC CCTTACTTTA ATTCTTGGGC CTCCAATAAG TGTCCCATAG GTGTCTQGCC 540 

AGGCCCACCT GCTQCGGATG TGCTCTGTGT GCGTGTGTGG GCACAGGTGT GAGTGTGTGA 600 

OTSACAGriTA CCCCATTTCA GTCATTTCCT GCTGCAACTA AGTCAGCAAC ACAGTTTCTC 660

TGAAAAAAAA AAAAAAAAAA AAAC 684 

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1036 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

TGGAGGCAGA TGCACAGGAG AAAGGTTCCX: GTCCGCACCC TCTCAGACCT GAGGCTGAGC 60 
TTGCAGrrcAG GGCTTCTCCT CGGCCCCTCG CCCGCCCCCA GAGCTGCXZAT CCCTGCTGTT 120

ACAAGCCAGA GGAGCCCX3GA TGTGAGGCCC CAGATCACCT CCftGGGACTT GGGGTrTCCCA 180 

TCTGAAATCC TTTATTTTTG TACCATGGGG TGGGCCCCGG GCTGAGAAGG AAGAAGCACXT 240 

CTCTCCCCGG CCTCCTCTGT CTGCACCCGT GGGGCTGTGA CTTACTCCTG CCTCCAGGGG 300

CGGGGCGGGG CCCCCTGGGA CCTCTTAAGG CCCAAGGTGG QCCCCAGGAC CTYTGGGCAG 360 

AC?K3GAYTCC TCATQGCAGA TGTGTGQCAA TGrPCTGGCTG WGTCTTTCCG GCAMCTGCGT 420 

YCCCTYTCCC GGGYTCCCCT GCTGCATGGT GGATGTGCTC CTTCCTGGCC CGGTCACATT 480 

GCCTCCTTGA GCCTTAGTCC AGGGGGTCAC TYCTCCCACC CCACCTACCT CACAGGGTTG 540 

TTCTGAGGGT GCACAGAGGA GCAAAGTCCC TGAAGGCCCT CAGGCAGTAT ATAGGQGCCG 600

CCCACCTTCA GCTGCCCTGG GATGGGAAGG ACCCAGCCCG ACCCXTTOGGC ATAACACTGT 660 

GTTTGCAAAT GGAGATTCAG GrCATTGGGGA TGCAGGTTGT GGGGAGCTGG CXTTOGCAGAG 720 

TAGGQGTAGT TGGCTTTGGCC TTCTCTTTGG TGATCCCACX: CCCAGCCATT TGCATTGCTG 780 

GCCX:AGCGCC TGGCCTGGGG GGCGGGGAGA GGCAGCAGAA GGGGCTOGGC AGGGGCGGrPG 840 

GAGGACTCAG GAACraXCG GGGAGAGTGG GTATQGCGGC TGAGCCAGGG GCCCTCCTGT 900

GTrTCAcrrc COGGGATGGG TCCTTCCTTC TCAGCTGTGT CCGACCCCAC CATGTAATAA 960 

AACCCAAAGG AACAGCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAN 1020 

CXX33GGGGGG GNCCCG 1036 

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 908 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
TTAAACAAAT QGAATCATGC AATATGTGAC CTTTTGCGTC TGGCTTATTT TATTTAGCAT 60 
AATCTTTTTG AGGTTCATCC AAGCTGTAGC ATGTATCAGC ACCTCATTTC TTTTTCTQGC 120
TGAATATTAT TOCATTATAT GGATTTACCA CAATTCATTT ACCTATTCAT CTTTTGTrTC 180
TGCTGTCTGG CTATTGTGAA TAATGCTTCG ATAAACATTC ATATACAAGT TTCTATGTGG
CTTTATGTTT TCATTTCTCT TOGCTATCTA CATGGGAGTA GflATTCTAGG TCATAATATA 
ArrTTATGTT TAACTTCTCA AAGAATTGCC AAAAGGTm TCATAGTGGC TGCATCATTT 
ACATTCCCAC CGGCAATGTA CAAGGATTTC TATTTTTCCA TATCCTTGCA CTTACCAACA 
CTiClTrrrK GTIWATWATTT ICTTTTTTCA TTATTGCCAC CCTAGTQGAT GrrGAAATGQC 
ATCTTATTGT TTTGATTTGC ATTTCTCTAA TGACAAATGA TATCATACTT TTTTTATGTG 
CTTACGGATC AAAGGTATTT CCTTGGAGAA ATGTCCXTTC AAGTCCITTG CXATTTCAAA 
ATTTGGTTAT TTGTCTTTTA TTATTCAGTT 1TAAGAAATT CTGGCXIAGGC GCAGrTGGCTC 
ACCTGTAATC OTAGCACTTT GGGAGGCCAA GGCGGGCAGA TCACTTGAGK TCAGGACTTC 
GAGACCAGCC TGGCCAACAT GGTGAAACCC CATCTTACTA AAAATACAAA AATTAGCTGG 
GCX?rGGTC3GC AGGTGCATGT AATCNTATCT ACTCAGGAGG CTGAGGCAQG AGAATCGCTT 
GAACCXAGGA GGCGGAGGCT GCAGTGAGCX: AAGATCACGC CATTGCACTC TAGCCTGGGT 
GACACAGA 

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 655 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
TGCACTGGTT CCTTCTCCCC AGCAAATACT GCCTTCTTGT TTTTCTCTGA TGTGGCAGGT 
GACTACAAAA TCCGCCTTGG TATTCTTCAA ATGCATATAT ATTCCTTTCT TGTCAGCTCC 
CTCTCTTCCT AGATTAGAAA ACTGCCTCAT TTTCTGCTCA CTGGATGTGC AGTCCCAGCT 
TGTCTTCCTC TCCTCCCCCC CTGTTOCAGG TGTTCTTTTT TTTTTTCTTC TCTCCCCACT 
GGGCAGCAAA AGTTGTTCCA CAGTGGAAAW TTAGGCATCC TCAAGTTTCY TCCCAGCTTC 
TGCTGTCTTT TCTTAGAGTA AATTGCCAAT TTCTGTTTOT ACAOGAAATC CTTTTTTAAA 
AATGGAATCA GTGTGGTCCC CATCTACTCT GCAAAAATTG CATTTTTCTC TATTTTCAAA 
TGAGATTTGT TCAAGTTTCA AAACCACGTG AAATAATAAA TGTATAGTAG TTTTCTTTTC 
CTTGGGCATT GCTWGATATG TGAAATGGGT TTATGAAAAA TAATAAAATC ATAACGCTAT 
TTGrrrTGACT TTCAATTTCA TGGGAATTTT TCTCAGCTAA ACTCTAAATG GTGATTARGC 
AAAAAAAAAA AAAAAAAACY GRAGGGGGGC CCGGTACCAA TTCGCCCTAT AATGA 655



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1102 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

' rmrrrrrr accatttaaa ataaaatgaa agtgaccttc tgtttataaa aatctttgtc 60 

TGCATCTCTC CTTATTTCCT TAGAAGAGAT TCCAAGAAGC GGTGAGTGAT TTCACGGCAG 120 

CAGAGGGrrrc GGACATATTA CGGGCGCGGA TCCCTCTTGG AGTGAGATGA CTCTCCGGAG 180

AGATTTAGrrC GTC^^CCCTCG CGTGTGAGGC TGCGTCACAC CCCAGGGATG TGTCTATCAA 240 

GATGGAAGAT CTTTTACACG CTCTTGATTT TGTTTGSCTY TTTTTCTATT ACTAGTGAGA 300 

AKGAAACTTT TTATATGATT ATTATCCATC ATAATCCAAC ACAAATTACT GCTTCATGTT 360 

CTTTTACTTT CCICTGAAGG TrTTAGTGCC TTTTAAAAAT TGCTATATAT TAAGcrivrr 420 

AATACTTCCA TCCTCTATTT GTC3GSCATCA RTTTCCCCGG GNACAGGCNT GCACATTTTG 480 

CCTTCACACG CTGGGTQGTT TTTCATTTTC AtTTTCTATTT CTCGTTCTTC TATCGTTTTA 540 

TCTTCAGACG GGTTTCTCCG TGTAGAAAGC AGTTTATGAA GATTTACTTT CGACAGrrCTT 600 

CTCTCTACTT TCTACAGTGA ATTCTCTCAT GTGTCTGGGA GTTTGGGGGT CTGGGTAAGA 660 

RTCCTCCTCT CACCCTATTC TCTATTACGA TCCACAGCCT CATGCTTTAT GARATTGGTG 720 

GCCGGGARCG QGGGAGATTT GCGGATCCCC CAAGCCAGAC TTTATCCCCC TATCCCTGCC 780 

TCTCGATCCC ACGTACAGGC CTGQGAACTC CCTGTGGGTA GGGGCCAATG GTCTCGCACT 840 

CTCflCCTCTA CCCCAGCX3CT GGCACAGGAT GGTCAAGGAG AGAGGCTGCC CAAGOGCATC 900 

CYTCTGCnXTT CCCCCTGACA CGCCTCCAAA GTGAGCAGCT AGGTTTCAAC AGCCCCACGT 960 

TOCAGC7K3GG AGATGAAGCT CAGGGTGGAG ACCAGTATCT CACAGTTCTC TTTGCATGGC 1020 

CGGGTACTTG TTAGTCAACT GATCAAGTGA AAATTCTAGC CCCAGAGGCA GGAGAATCCG 1080

GAACAAAATT AAACCAGCCA GG 1102

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1533 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

GGCACGAGCC OJCACGGGCA GCGCCCCATA GCGCCAGGGA CCCCCTGGCA GCGGGAGCCG 60 
CGGGTCGaCG TTATGGATCC AGCGGGCQGC CCCCX5GGGCG TGCTCCCGCG GCXTCTGCCGG 120

TOJCTGGTGC TGCTGAACCC GCGCGGCGGC AAGGGCAAGG CCTTGCAGCT CTTCCGGAGT 180 

CACGTGCAGC CCCTTTTGGC TGAGQCTGAA ATCTCCTTCA CGCTGATGCT CACTGAGCGG 240 

CGGAACCACG CXXX3QGARCT GGTGCGC?rCG GAGGAGCTGG QCCGCTGGRA CGCTCTGGTG 300 

GTCATGTYTG GAGACGGGCT GATGCACGAG GTGGTGAACG GGCTTCATGG AGCGGCCTGA 360 

CTGGGAGACC GCCATCCAGA AGCCCCTGTG TAGCCTCXXA GCAGGCTCTG GCAACGCSCT 420 

GGCAGCTTCC TTRAACCATT ATGCTGGCTA TRAGCAGGTC ACCAATGAAG ACCTCCTGAC 480 

CAACTGCACG CTATTGCTGT GCXX3CCGGCT GCTGTCACCC ATGAACCTGC TGTCTCTGCA 540 

CACGGCTTCG GGGCTGCXXX: TCTTCTCTGT GCTCAGCCTG GCCTGGGGCT TCATTGCTGA 600 

TGTGGACCTA GAGAGTGAGA AGTATCGGCG TCTGGGGC3AG ATGOGCTTCA CTCTGGGCAC 660 

CTTCCTGCGT CTGGCAGCCX: TGCGCACXTTA CCGCGGCXXA CTGGCCTACC TCCCTGTAGG 720 

AAGAGTGQGT TCCAAGACAC CTGCCTCCCC CGTTGTQGTC CAGCAGGGCC CGGTAGflTGC 780 

ACACCnCTG CCACTGGAGG AGCCAGTGCC CTCTCACTGG ACAGTOCSTGC CXGACGAGGA 840 

CTTTGTGCTA GTCCTGGCAC TGCTGCACTC GCACCTGGGC AGTGAGATGT TTGCTGCACC 900 

CATC3GGCXX3C TGTGCAGCTG GCGTCATGCA TCTGTTCTAC GTGCGGGCGG GftGTGTCTCG 960 

TGCCATGCTG CTGCGCCTCT TCCTGGCCAT GGAGAAGGGC AGGCATATQG AGTATGAATG 1020 

CCCCTACTTG CTEATATGTGC CCGTOGTCGC CTTCCGCTTG GAGCCCAAGG ATGGGAAAGG 1080 

TGTGTITGCA GTCGATOGGG AATTGATGGT TftGCGAGGCC GTGCAGGGCC AGGTQCACCC 1140 

AAACTACTTC TOGATOGTCA GCGGTTQCC3T GGAGCCCCCG CCCAGCTGGA AGCCXXAGCA 1200 

GATGCCACCG CCAGAAGAGC CCTTATGACC CCTGGGCCGC GCTGTGCCTT AGTC3TCTACT 1260 

TGCAGGftCCC TTCXTCXTTTC CCTAGGGCTG CAGGGCTTGT CCACAGCTCC TGTQGGGGTG 1320 

GAGGAGACTC CTCTGGAGAA GGGTGAGAAG GTGGAGGCTA TGCTTTQGGG GGACAGGCCA 1380 

GAATGAAGTC CTQQGTCRQG AGCCCAGCTG GCTQGGCCXIA GCTQCCTATG TAAGGCXTTTC 1440 

TAGTTTGTTC TGftGACCCCC ACCXXIACGAA CCAAATCCAA ATAAAGTGAC ATTCCCAAAA 1500 

AAAAAAAAAA AAAAAAAAAA MitOCCCaXSGG OGG 1533 



(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 575 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
ATCCTCTQGA ATCTAGGTGG AAGCCACCAA GCCTTCTTCA CACTTGCGTT CTGAGCATCT 
GCAGACTTAA CCCCATGTGG CAATCACCAA GGCTTATGGC TTGTGTCCTC CAGAACTGTG 
QCCAGAGCTG TACCTGGGCC CCTTTGAGCT GAGGCTGAAG 0CAGAC3TCTG AAGCTCAGCA 
GGGCAGTARG GCCCTGQGCC TGGCCCCTGA AACCATTCTT TTCTCCTAAG CCTCTGGGCC 
TTTGATQGGA RQQGCICTCC TCAAGATTTT TGAAATGCCT TTGGAGGGTT TTTGCCTTGT 
CTTGGATATT GGCTTCCTTT TAGITATQCT CATCTCTCTA GCAAGTGAAT GnTTCACAAC 
CTGCTTGGAT TCTTTCTCTA CCACAGARCC AGGCTGCAAA TTTTACAAAC TTTTACACTC 
TGTTTCCCTT TTAAATATAA ATTTCAATGT TAAGTCACrT CTTTGCTCCC ATATCTGATT 
TAGGTTQCTG GAAGTAGCCA AGTCACCTCT TGAATGCTTT GCTGCTTAGA AATTTCCTCT 
ACTAGGTAGC CTGGGTCATC ACACTTAAGT TCAAA 

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 639 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
TCCTTTCATC TTAAGCACCA CCCGACAGGG CAGGTACTAT TACCATCTCC GriTTGACAGA 
TNAGGAACCT GGCACAGGAA GCATTTAAGT GGATTCCCCA QGATOGCCCC ACTGTCAGGA 
GCAGANTCAG AATQGGCCTC AGCATCAGGC TCCCAATCCT GGCTTCTAAC TGCTGCQCTC 
TGCCCTTCYC TCWCCCCACC TCCCCACTCC AGTGCCTITG GTCATGCCAC TGCAGCTTTC 
AGGCCAATAC TGGATTAGCC TCTTAGTGTT CTTGTCCCTG CAGCCATTTC CCCAGGCAGC 
AATTCCATGT GCCCTCACTG ATGTAGGTGG CTCTTGTGTC ATTTGTCACA TCCTATTQAA 
TTGTTTATGC ATCTTGTTCA CACTCACAGC ACCCTCCCTC TCACACGTCC TCCTTATAAA 
AATGTCCCTC AGTCTCPGCT ATGAGCCAGG TGCAGACTTA AGTGACAGGG CTGCTACGGG 
AAATAAAAAA TTAACAAGGA GCACCTGCCT CTTAATGCAC AGTAACAAAC TATGTTAAGT
GTCAGGAAGG AAAGGTTAAG GATGCCflGGA AGGCTTTTAA TAAATAACCT GACTTAGATG 
GGCAGGTGGT GCTGARGATT AflGAACGTGT TCTTCTCGA 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 744 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 
GAATTCGGCA CGAGAGTQGC TGGAGTCTGG CTGCAGAGGG AAGACATCAG CAGGGAGGGA 
GCCAQGGCCT GTCACATCTT TCCTCTGGCC ATTGTCCTGG TCTTTGTAAG CCCAGAATCT 
CCCCTTCCCT GAAGGGAGGC CAGCACCCCA GGAGGGCAGC AGGTGTGCTG TGAGGGTTGG 
AGTAGTGTGA GAGGTCAGGG TACACTAGAA TGGCCATGGA CACCATGTQG GGGTGCTCTG 
GGCTGGGCCA CAGAACAGTG TCCTTCCTGC TGCTCCTCCC CTGCAGCTTC CCCCGACCTT 
GTNCSTTTATT TGGTTTGATA CCAATCAGCA GACCCTGCAA GGTGGAfiGCT CCCAQGCTCT 
CAGTCCCACS ACTCTCATGT GCCAGTCACC CNTACTC3TAA CTGCCCAATG AGTACTTCTT 
GCCCACTGCC AAGATAGAGC CAGTTTACCA AGACAGGGGA ATTGCAGTAG AGAAAGAGTT 
GAATATACAT AGAGCCAGCT AAATGGGAGA GTQGAGTTTT CTTATTACTT AAATCAGCCT 
CCCTTAAAAT TCAGAGGTGA GAATTTTTCA AGGACAGTTT GGTTGGSCAGG CCTAGGGAAT 
GGATC3CTGCT GATTGGCTAG GGATGCAATC ATAGGGGTGT AGAAAAGTWC CTTGrrOCACT 
GAGTCCACIT TTGGTGAGAG CTACCAAGGA GCTGCTGGTC TGCTGGTCCC GGTAGAQCCA 
TCTOGTGrrCA GGAATGCAAA AGTG 

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 526 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 
GCAGGGGAAT TCGGCCACGG AGGGGTTTCA ACAGGGCCCG TGGGGTGAGG TGCARACACA 
AAGCCCATAA GTGCTGGCCT CTTTGGGACAA ATGAGAGAAA TCCCATAGGG TOGTGATGAC 120
AGCGCAYTCA GCCATCYTAY TCCTGGGGAA AATGAAACTT GTGCTCCTAT CAAATGCTCA 180
GTTGTAAAAC TGGAAAAAAA TTTTAGAAGA CATCTTGTCX: AGCATCTGTG TTTATGTCTA 240
TAAAATGTAG AAAACTAAAG CACAGAGATG TTAAATGTTT TGTCCAAGGT CXDAACAGCTG 300
GTTAGCARGC TTGGrKrTGGT GACCTTTCTA CTGAACCACA GTCCCGCrGG GGGAAGTCCT 360
CAGCACAGAT GGCTGCTGCT ATAGCTGGC3G TATGGGCAGT ATTAGTAGTT AACCAGTCAA 420
CCCAAGTTCC CATAGTCTAG GTTCTGCTTC AGCTGGAGGT TAGGGAAAAA CACAAGAAAA 480
TCCCTTACCA CTCTACCAGT GCTGGGGGAT GTACTAAGAG ATCCCC 526



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 426 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

GGCACAGGGC AGGAGAGACT TGGTCCATGG GGAGAAGCCT GCAGTATAGA TGQGACCPCC 60

AGGAGCCCAA GTAGCATAGA CCCTGCTGAT CCGGGGCCAT TGAGCCAGAG GATTTGQGCT 120 

GAATGTCCCC AGAGACAAAA GGGAAAGGTA GATCCTTTCC CTTAAAGATG AAAGCCATOG 180 

CCCQGGCTTG CTTATTGCTC TCTCTCCTGG TCCTTCCACA TGrrTGTTTCT GAACATTPGT 240 

TCTGGCATCA CAATCCCCGT CATCCTGTCA TCTGGCCCTT CCCACCTTTC CACCTTATCT 300 

CTTGCAGTGT CTCCGCGTCG ACCTGGCACC TGGGTGAARG CTTGCTCTTG CPGGTGCCCA 360

TAGCCCCCAG TGTATGGTCT TGAMCTCCCC AGCCATATGG ARACCCACCT CAGGAGGGCC 420

CCTCGA 426 

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 844 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
GGCACAGCGG CACGAGATAG GAAGCTTGGC AGGGGCAGCT CCCCCAGTOC GCATTGCCCT 60 
GTAACTCGAG CGCCTGQGAG TGGGGAGAGG CTTGGAAATG GAGCAGGGTG GTGGACCTCG
TCTTCTCCTG CTCATCCCAG GCCTCCTCCA TAACACCTTAC CTAGCACGGC CTGGGGACTT
CCCAGCCCAA GGAACAACTG AGAATACTGA GTGCCAGQGT AGCCCTAGCC CXATTTCACA
CCTGGC5CAAA GTGAGGTCAC TGGATTCAAA CACTCflGATT TAAACCTCCT CTGTGTCTGC
AGCACCTGTA TATAACTGCX: AGCCTCTGCT GCXXCTCTCC AAAAAGTCTC TGCCCTTGTC
TTTGOCACCT GTCTCTGTCC TCCCCATTCT CTGCTCCTCC TTTCTCCAAC TCAGANTCAC
CCTGTTAGTT CAGCAAATGT TCATCGAGCT CCATAATGTA GCAGGACAGG NCTGTCTAAC
AGATTCTGC3I CTTGCAAGGG TGAGACAAGT ACTCTCCATC TTTCTCTCAT CTTCACAGAT
GGTCTGCTCA ACAACTTTGC ACTGAATTGT AAATAATTGA TACTGCATAA AACATTGATG
TTCTTTAAGG GTAGTCCAGC AAGGTGGCAA GTCTTATAAT GATAACTGCT CAAGGATCTC
TCAGTGAAGC ATTTGGGGST GCTAGCTCTG CCTATGGGTG AQGTCAGCTA TCTCACGCCA
TCTACTTCCA COTGCCCCCC CATQCCAGGC TCACCCTGAG CTGAGATGCC TGAGCAGGTG
GCAGAAAGGA GCXACCTGGT TTATGCTTCG GGACCACAAA CTCCTCTATC CAGANGACAG
TTTT



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1985 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:
AGCCCTGCTG AflGT!ACAGGT TCTTCTATCA GrrTCTGTTG GGCAATGAAC GAGCAACAGC 
AAAGGAGATC AGGGATGAAT ATCSTGGflGAC GCTGAGCflAG ATTTACCTGT CTTACTACOG 
CTCTEACCTG GGGCCSGCTCA TGAAGGTGCA GTATGAGGAA GTCGCTGAGA AAGATGATCT 
AATGQC3TGTG GAAGATACAG CAAAGAAAGG ATTCTYCrCA AAGCCATCGC TCCGCAGCAG 
GAACACCATT TTCACCCTAG GAACCCGCQG CTCTGTCATC TCCCCCACTG AACTTGAGGC 
CCCCATCCTG GTGCCTCACA CAGCGCAGCG GNAGAGCAGA GGTATCCATT TGAGGCOCTC 
TTCCGCAGCC AGCACTACGS CCTCCTAGAC AATTCCTGCC GCGAATACCT TTTCATCTGr 
GAATTTTTTG TTGTCTCTGG CCCAGYTGCA CACGACCTGT TCCATGCTGT CATGQGCCGT 
ACACTCAGCA TGACCCTGAA ACACCTGGAT TCTTATCTAG CTGACTGCTA CGATGCCATT 
GCTGTTTrTC TCTGTATCCA CATTGTTCTC CGGTTCCGTA ACATTGCAGC AAAGAGGGAT
GTPCCPGCCC TQGACAGGTA CTQGGGAACA GGTGCTPGCC TTGCTATGGC CACGGTTTGA 660

ACTGATCCTG GAGATGAATG TTCAGAGCGT CCX5AAGCACT GACCCCCAGC GCCTAGGGGG 720 


GTK3GATACT CGGCCCCACT ATATCACACG CXXXTATGCA GAGTTCTCCT CCGCTCTTGT 780 

CAGTATCAAC CAGACAATTC CTAATGAACG GACCATGCAA TTGCTGGGAC AGCTGCAGGT 840 

GGAGC3TGGAG AATTTICTCC TCCGAGTGGC AGCTGAGTTC TCXTCAAGGA AGGAGCAGCT 900

TGTGrrTTCTG ATCAACAACT ATGACATGAT GCTGQGTGTG CTGATCGAGC GQGCTGCAGA 960 

TGACAGCAAA GAGGTTGAGA GCTTCCAGCA GCTGCTCAAT GCTCQGACAC AGGAATTCAT 1020 


TGAAGAfiTTG CTGTCTCCCC CTTTTGQGGG TTTAGTQGCA TTTGTGAAGG AGGCTGAGGC 1080 

TTTGATTGAG CGTTGGACAGG CTGAGOGACT TCGAGGGGAA GAAGCXTGOG TAACTCAGCT 1140 

GATCCGTGGC TTTGGTAGTT CCTGGAAATC ATCAGTOGAA TCTCTGAGTC AQGATGTAAT 1200

GCGGAGTTTC ACCAACTTCA GAAATGGCAC CAGTATCATT CAGGGAGCGC TGACCCAGCT 1260 

GATCCAGCTC TATCATCGCT TCCACCGGGT GCTGTCCCAG CCGCAGCTCC GAGCCCTCCC 1320 


TGCCCGGGCT GAGCTCATCA ACATTCACrA CCTTATGGTG GAGCTCAAGA AGCATAAGCC 1380 

CAACTTCTGA TGTGCCAGAA ACCGCCCTGA GATCTGCCGG TCATCTCCAT GGACTTCPGC 1440 

ACCCCATTCC ATACCCTTCT TCACCTGGGG TACCCCTTCC AGrTTTTCCCC TTGCTTCCCA 1500

GGCCCTTGAC ATQGCTTACC TGCCTTCACT CXXAGCACCT TGCCCAACAG GATAAGCTQG 1560 

ATCCCCTTGG CXTTTCTGAAT ATCCCAGrrGT CTTGAGGrTTT CCCAAGACCA CTTCCCICTG 1620 


GGCTTCCAAA ATGGCCTTTA 1X3VTTTCTCC AGTCTGTCAC CCTCCTTTCC TGCTCCCATA 1680 

CACCCAAGGC TTGTTTCTTC CCCTGTAAAA ACCACTGCCT CAATCTCTGG TTCACTCAAC 1740 

TAGTCACCAT GTCCTGAQGC ATGAASCCTC CTCAGCTCTT QGAATTGCTG GCAAGGGGTC 1800

ACTGCCTCTG AGTCATTGTG TTTTTCAAAG TGATTTCTTT TCTGTAGCTT TTTGACCTAA 1860 

GATCTCAGCA AOTTGAACAC TAACCTCTCC CCTCCTGGCT CAAGAATTAC TCCGAAGTCA 1920 


GTCTGCAGAA AATAAATATT TAGTATGACA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1980 

AAAAA 1985 


(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1416 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

ATATGAAGGG AAAGAATTTG ATTATGTTTT CTCAATTGAT GTCAATGAAG GTGGACCATC 60 

ATATAAATTG CCATATAATA CCAGTGATGA CCCTTQGTTA ACTGCATACA ACTTCTTACA 120 

GAAGAATGAT TTGAATCCTA TGTTTCTGGA TCAAGTAGCT AAATTTATTA TTGATAACAC 180 

AAAAGGTCAA ATGTTGGGAC TTGGGAATCC CAGCTTTTCA GATCCATTTA CAGGTQGTGG 240 

TOGGTATGTT CCGGGCTCTT CGGGATCTTC TAACACACTA CCCACAGCAG ATCCTTTTAC 300 

AGC3TGCTGGT CGTTATGTAC CAGGTTCTGC AAGTATGGGA ACTACCATGG CCGGAGTTGA 360 

TCCATTTACA GGGAATAGTG OCTACCGATC AGCTGCATCT AAAACAATGA ATATTTATTT 420 

CCCTAAAAAA GAGGCTGTCA CATTTGACCA AGCAAACCCT ACACAAATAT TAGGTAAACT 480 

GAAGGAACTT AATQGAACTG CACCTGAAGA GAAGAAGTTA ACTGAGGATG ACTTGATACT 540 

TCTTGAGAAG ATACTGTCTC TAATATGTAA TAGTTCTTCA GAAAAACCCA CAGTCCflGCA 600 

ACTTCAGATT TTGTGGAAAG CTATTAACTG TCCTGAAGAT ATTGTCTTTC CTGCACTTGA 660 

CATTCTTCGG TTGTCAATTA AACACCCCAG TGTGAATGAG AACTTCTGCA ATGAAAAQGA 720 

AGQGGCrCAG TTCAGCAGTC ATCTTATCAA TCTTCTGAAC CCTAAAGGAA AGCCAGCAAA 780 

CCAGCTGCTT GCTCTCAGGA CTTTTTGCAA TTGTTTTOrT GGCCAGGCAG GACAAAAACT 840 

CATGATGrrCC CAGAGGGAAT CACTGATGTC CCATGCAATA GAACTGAAAT CAOQGAGCAA 900 

TAAGAACATT CACATTGCTC TGGCTACATT GGCCCTGAAC TATTCTGTTT GmTCATAA 960 

AGACCATAAC ATTGAAGGGA AAGCCCAATG TTTGTCACTA ATTAGCACAA TCTTGGAAGT 1020 

AGTACAAGAC CTAGAAGCCA CTTTTAGACT TCTTGTGGCT CTTGGAACAC TTATCAGTGA 1080 

TCATTCAAAT GCTGTACAAT TAGCCAAGTC TTTAGGrGTT GATTCTCAAA TAAAAAAGTA 1140 

TTCCTCAGTA TCAGAAOCAG CTAAAGTAAG TGAATGCTGT AGATTTATCC TAAATTTGCT 1200 

GTAGCAGTQG GGAAGAGGGA GGGATAITrT TAATTGATTA GTUmTi'lT CCTCACATTr 1260 

GACATGACTG ATAACAGATA ATTAAAAAAA GAGAATACGG TGGATTAAGT AAAATTTFAC 1320 

ATCTTGTAAA GTCGTGGGGA GGGGAAACAG AAAXAAAATT TTTGCACTGC TGAAAAAAAA 1380 

AAAAAAAAAA AAAAOGAAAC TCGAGOGGGG GCCCG6 1416 






(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1935 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

NTCTACCCTA ATCAAGATGG GGACATACTT CGCGACCAGG TTCTTCATGA ACATATCCAG 60 


AGATTGTCTA AAGTAGTGAC TGCAAATCAC AGAGCTCTTC AGATACCAGA GGTTTATCTT 120 

CGAGAAGCAC CATGGCXATC TGCACAATCA GAAATCAGGA CAATAAGTGC TTATAAAACC 180 

CXXXX3GGACA AAGTCCAGTG CATCCTGAGA AICTGCTCTA CGATTATGAA CCTCCTGAGC 240

CTGGCXIAATG AGGACTCTGT CCCTGGAGCG GATGACTTTG TTCCTGTGTT GGTGTTTGTG 300 

TTGATAAfiGG CAAATCCACX! CTGTTTGCTG TCTACTGTGC AGTATATCAG TAGCTTTTAT 360 


GCTAGCTGTC TGTCTGGAGA GGAGTCCTAT TGGTQGATGC AGTTCACAGC AGCAGTAGAA 420 

TTCATTAAAA CCATCGATGA CCGAAAGTTGA CCAAGACCAA GGCCCACCAA GGCAGCAGAC 480 

TGTTAATCAG ACAAACAGAT CTCTGAGAAG GTGCATCAGC TGCTTTGAAG GCTGAAGATT 540

GTTTTGTATG ATACTGCACA GCATCAQGCA TTTTAAAGCA GATCTTTACT AAACAQGTTA 600 

ATGAGCTAAC AAGCAGCTTTC TCrCGTCTTT GGGCTCTTTC CTTTCTGAGT TGCATATTCT 660 


ATTTTCTTGT CCXXAAGTAG AGACTAGTAC TACAAAAAGG GACCACATTT TTCAAGTATT 720 

TCTAAGTATA AAAAACAAAA CAAAAATCTC TTAGGAAATG TCTAGACCTC CATTCTTGGA 780 

TTCCCTTTCT TTCCTrTTAT TTTAAAAAAG AACAGTACCC CTCTTTTAAG ATOCTGrCTT 840

ACATTAATGA GCftTCTAATG GAAAGAAGGT ATGAGTTGCA CTGAGGAirrA GAATAGTCGT 900 

GCXSTTAGrrOG CATTATCTAT AAATACACTC ACCTAAATTC AAAGCTAAGA AGGAAATGTA 960 


AATATAATAT ATATTTATAT TTGATGTAAT ATGGACATCT GCAGATTCTA ATAAACAAGG 1020 

ACTATTGCTG ATAGTAGGCT GTSACATACT GTCTTGrPGAA AJrGGTTTCCT TGACAAAATT 1080 

TAAGCTGAGC TTAAAAGCAA AAAAACAAAA AGTACACAGA AATATTTATT AAAAaXSTAAT 1140

ACAGTTTATT GAACnTCTA GGTATGGAGT TTGATGGACA GGGCIGCCTY TAATGAGTGT 1200 

GAAGGTCACT AAGTCACTTA GACATCTCAC CX3TGGAAGTT TCTGAGCCTG CATTAGGAGA 1260 


TAGACTGATT ACCATACATG ACATAAAAAG GAACAGTGGA TAGCTCATAC TTTATGGTCG 1320 

rPCTTCTCCT CCGAAATAAT ATACTGCAGA AATCCCAGAC AGAGCTCCTT ACAAACCTTT 1380 

AATTGTAATA TAITTTTGAT GATTATTCAC ATTGAATGCA CAGACCAAGA ATTCAGTGAA 1440

TGrcATTTTT TAAAAAACTA ATTTGTATTG TCTGCTCTAG TGATACAAGT TTTACTAGTC 1500 

ATAAACTATT TTAATCAACC ATACTATTCT TATGGAAAAA AATATCTATT TTQGCAGGTT 1560 


TCrcrrGCCTT TATTTCCCTC TTCTGAAAAA AAGTCTGTOT TTTCATAGTT TGGTTTGCAT 1620 

TGTATATCAA TAATTAATCA GGAATGGGTT TTGGTGOCTG AAAAATTGGC CATGGAGGCA 1680 

CACCAAAGCT TCAAGCACAA GTCTTGTACA TGGGCCATCA CTGTCTGGriT TCACTTCGTG 1740

TGTTTCCTAA ACACATTTAG CTGCITITIT AACAAACTCA GCCCCATACT TGAGTCCCIT 1800

GTTGTTGGGA GCATTTCCAG GCATCTTTTA AGQGAACTGT GACAAACAGC CTCGGGCAGA 1860 

TGAACACGGA GGCTCTCTGT TGTCTGTCTC TGAGATCTTT GTGTCTGGGA ATGCXTTAAAG 1920 

NTTTTCairTT TTTTT 1935 






(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 599 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

GAATTCGGCA CGAGCGTCCA CQCAGCCGCC QGCCQGCCAG CACCCAQQGC CCTGCATGCC 60 

AGGrrOGTTGG AGGTGGCAGC GAGACATGCA CCCGGCCCGG AAGCTCCTCA GCCTCCTGTT 120 

CCTCATCCTG ATGGGCACTG AACTCACTCA ASACTCCGCT GCCCCCGACT CCCTC3CTGAG 180 

AAGTTCAAAG QGCAGCACGA GGGGGTCTTT GGCTGCTATT GTCATCTGGA GGGGGAAGAG 240 

TGAGAGCCGG ATAGCCAAGA CCCCAGGCAT TTTCAGAGGT QGCGGGACCT TftGTCCTACC 300 

CCCAACACAC ACCCCTGAGTr GGCTCATCCT CCCTTTGQGC ATAACGCTGC CCTTGGGGGC 360 

TCCAGAAACA GGCGGTGC3GG ATTGTGCCGC TGAGACCTQG AAGGGCAGCC AGCGTGCCQG 420 

CCAGCTGTGT GCATTGCTQG CTTAATATGC AGGGCTTGGG GGGCTGTGGC CACATGCCCG 480 

GCAGGAGGTG AGTGAGGAGC CCTGrTGOCGT GCTGGTGTGG GGATCGTCGG CATTTCAAAC 540 

TACCCTGAAC AATGTATCAA TAGAGAAAAA AAAAAAAAAA AAAACTCGA 599 






(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 784 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
GAATTCGGCA CAGAAAAAAA AGAGAGACTG GGTCTTACTG TGTTGCCCAG ACTTGTCTTG 60
AACTCCTGCC TCAGCCTCTC AAGrEACTTGG GATTATAGGC CAAGAAGCCA CCATGCCTAG 120
CTTCTTOCTG TCATTGATCC AGACTAATAC TCTGGGGTCA GCCTCATTTC ITCTCTTrCr 180
CACTTTGCAC ATCCACTTGT CACCAAATCK RGTTCATTCT GCATCCTAAG TAAGTCCTTT 240
GATTCCTCCA GTTGTTCATT AGTAATGrrCT CAARTGTAAT TTTTTCTAGT AGTTTTCAGC 300
CTGTCTTTCC KGCCTTCAGT CTTAACTTCT CCAGTACATA KGCCACATTG TTGTCAGCAK 360
GATCAWATTT TATTTAAAAA TACTTTACAW AKGTTTATKG CCAAATATTA GRAAATACAG 420
ATTCATQGAA AG?iAAAATCA CTGTCCCAAG GAGGTCACTG QCATGGTGAG GTTAAGGGGT 480
GATTTTAATT TTTAAAAATG TATATTTTTT CCTGTGTAGA GTAGTAACAC CCTTGAAAAC 540
ACAWTCCXnT GTAAAGTCTC TAATTCTGTA CrcCGCATCT AGSTGRTCTC TTCTTTCTCA 600
GATATTTTAC AATTTCATTT ATCACCACCT TTCTCTAGCC TTTACCCGTC TCTTCAATAT 660
TWACATATGC AGAAGTTTCT CXTAACAAAC ACCTGCCTCT GCCTCAGTTC TGCTACCACC 720
CTGTTGCTTT CTTTCCCTTC ACAATCAAAT TTAAGAGTGT CAAAAAAAAA AAAAAAAAAC 780
TCGA 784



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1035 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
AGAGGCCTQG CTGCGTTGCC CTATCTCOGT CTCCGCCACC CACTTAGCGT TTTAGGCATC 
AATTACCAGC AGTITCTCCG CCACTATCTG GAAAATTACC CGATTGCTCC CGGCAGAATA 
CAAGAGCTTG AAGAACGCCG CftGTTGOGTG GAAGCCTGCA GAGCAAGGGA AGCAGCGTTT 
GATGCCGAAT ATCAGCGAAA TCCTCACAGG GTGGACCTCG ATATTTTAAC CTTTACGATA 
GCTCTGACTG CCTCTGAACT TATCAACCCT CTGATAGAAG AACTTGGTTG CGATAAGTTT 
ATCAATAGAG AATAGTTAGG TGGTGACACT ACTTCAAGAG AACCTCTGCA TTCCAGTCAT 
AOCAATCCTG CAACTTGATT TTCAGAAGTC AAGAGTATAT OGCGATAAGA CAGTGCACAG 
GTQGAGGGGA AAAAAAOGGG GAGGGGGAAG CTTATCTTGA AAAAGCATCA CAGAAGTAGA 
AAAAAATGTC GAAAGCATTA TAACTGTAAC GTTCmGAG TTTGTGATTG ATCCACATTT 
TTCCCCCTGC ATTATGGAAA ATGTCTCTCA GCATTGCTTT ATTACAAAGT AAAGGATGGT 
TTTATAAAAT TGAGACTGAT GAAACATCAA TACTAGAGCC CATGAGGATG AAAGAAATTA 
TCAAATAGTG CTGAACAGAA TAAGATGTTA ACGCTGAGTT ATTAGGACTG GAAGGCTATG
AAAAGAACTT GAAATTGTCG GAATATGTGC TCTCTTCATG TCATATTCAA TAGAAGTTTC 780

TAGTTTAAGA TTGATTTTGT GTrTTCTTAG GCATTTCAAG TGACAAGCAA AGTAAATGTA 840 

TATATTATGT GATAAATCAT GTTTrCAAGA ACGTCAAATT TCTGGACTTT TTTCTTTCAA 900 

TTrrTAATTT TTAAAGTTTT TTTGGTATTA AAAAATCYAT TCACAAGCCA AAAAATWTWT 960 

WAAAWIWCM GCGAAAAGCC AAAAAAAAAA AAAAMMAGGG GGGGCCGGGC CCCATCCCCC 1020

CAAGGGGGTC CNGNT 1035 




(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2218 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 


AGCTPATTAGG CCCTTTTOTG QGAGCCCCAT GTTTTGTTTT TCTGAGTTGG TGGGGAGGGA 60 

SGGAGGGGGA GGGCTGAATT GTTTTGCftGA GGAAGATGGC ATCTGTQCTT TAAATTTCTC 120 

AITACTGGGT TAGAAAACAA AGAGGGAKTG CCCTGCACAT ' mvITriXJr GCTTTTAAAT 180

GTTTCTTAAG TTGGAACAGG TTTCCTCGGG CCTGTTTTGA CTGATTGCTG GAGTGCATTT 240 

GATAGTTAAA AATTACTAAT TGOTTTTATT TCCCTTCACA CTCTGCCTCC CCACTTCTCC 300 


CCCCGTTACT GAAAAATAAC CATTTTAGTG TCAGGCTAGA AATTGAATTG CTGAGTTTTG 360 

TGTATCCTTT AAATTAAAAA CCACAAGTGT TTATTGTACT QGTTAAACTG TAGCATCTCA 420 

GCATCTGGGT GGAAGCTGCC TATATTTCTT CCCAGTTTAA CTGGGGACCA TCTGTGAAAT 480

TAATTTTCCA TCCAGACAGC TGCTGTGAGC AAATGAACAT AAATGCTCGC TGGAAATTTA 540 

CTAACCAGTT TTTATATTGA CCTQCAGTGT AAAAAGCACA TTTAATTATA AACAATATAT 600 


TCAAAATQGG CAAATTTTAT TTTCAAATGC AGTOEAGAGC TAGATTAAAA GCAACTCTTT 660 

GCCACCTACT CTGCCCTTTT GGCAAAGTTA CCTTGAACAA AGAATCTTAA GGGTTTATTA 720 

AGAACTCTTT ATTTTCTTCA TACCCTGTrC TCTGCAGTCC TTTCTAACflG CTTCTGQGTG 780

CAGATTTTCT TCGGCATCCT TTTGCACTCA GCTTATTACA GGTAGGTAGT GCTTAAGAAA 840 

AGTCATGGAG GACTAAAGCC TAAGTCCTTT TCACTTTTCC TCCATCTGAA GGTAGGTGAG 900 


TTCATCCrCT TCATAGTAAT GCTGTTTTAC CAAGACTTTA TAGCAGATGG ACCCAGAAAG 960 

AATTTTCTGC TAITGTGTTC ACTACAACAG GATAGGGACA TCAGACAGCC CCAGAAAC3CC 1020 

CTTCCAGATC TGATATGGGA CTATTAATTT TTATGCTGTT AATTGGTATT CATTCACAAT 1080

GCAGTTGAAG QGGGAAGGCT CCACTGCATT CTTTCGCTAA GGCCTGAATG CTTGCTCATC 1140

TGTAAGATCT ATACTCGAGG TTTTGTTTTC CTTTTAAAAT TCTTTAGGGA GAGAGGGATC 1200 

GTTTCTGAGG GC3TTCTGAAA GTATGATTCA ATGTGCAACA 'EACAGGTAGG TCTTCAGCAT 1260 

AAGCTGAAAT ATATGCATGT AAAAACTTTG ACATCTTTTT TTTTAATTTT CCACTTTCTT 1320 

CTTAACTTTA CTTCTCTTTT TGTCCCCCCC CXMCTTACA GAAGTTGAGG CCAAGGGAGA 1380 

ATGGTAGGCA CAGAAGAAAC ATGGCAAACT GCTCTGTGCT TTCAAACCAA AGTGTTCCCC 1440 

CCAACCCCAA ATTTGTCTAA GCACTGGCCA GrTCTGTTGTC GGCATTGTTT TCTACAACCA 1500 

AATTCTGGGT TTTTTTCTTC TTTCTTTAAA CATAGAGGTA CCACXIACAAG GGATCCCCTA 1560 

CTCTCTOGCA GCTCTTGAAA GCATCTGTTT GAGGGAAAGG TCTCTQGGCA AGCAAGTOGT 1620 

TATTTGGATT GCTTGCTTCC CTTTTTCCAC CTGGGACATT GYAATCATAA AATAACAGTA 1680 

AATTCCAAAC CTCAAAAACT ATTATGGCCT GAQCACAGCT GAAATCTAGC AGAGTTEAAC 1740 

TCTTCTGCCT CCATGTCTGT CACTTATAAT TCAGGTTCTG CrGTTGGCTT CAGAACATCA 1800 

GCAGAAGAAT CGTTTrAlGC TAGTTATTGC ATTCATGGTT GAAACTCAAC TTAQGGAAAG 1860

GGTTCCAATG TATTAAGCAA TGGGCTGCTT CTCCCCAATC CTOXTAACA ATTCGTTGTC 1920 

TGGACTTCTC ATCTAAAAGG TrAGTGGCTT TTGCTTGGGA TCAGTGCTCT CTATTGATGT 1980 

TCTTGCTGGT CTCCAGACAC ATTCCTGTTG CATTAAGACT TGAAAGACTT GTAGATCTCT 2040 

GATGTTCAGG CACAGGATGC TGAAAGCTAT GTTACTATTC TTAGTTTGTA AATTGTCCTT 2100 

TTGATACCAT CATCTTGTTT TCTTTTTCrrA GGTATAAATA AAAACACTGT TCACAATAAA 2160 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 2218 






(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1351 base pairs

(B) TYPE: nucleic acid

(C) STRANDEENESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 
CTTCACAGAC TGACAGAATG GTTTTGrrPr GrmVITlT GTmVlTrr GTTTTTCAGA 60
TGGACTCTAG CTCTGTCACC CAGGCTGGAG TGCAGTGGTG CGATCTCGGC TCACTGCAAG 120
CTCCGCCTCC CGGGTTCTCA CCATTCTCCT GCCTCAGCCT CCCGAGTAGC TGGGACTACA 180
GGCGCCCACC ACCACGCCCG GCTAATTTTT TGTArTTTTT AGTAGAGACG GGGTTTCACC 240



WO 98/54963



PCT/US98/11422 






ATGTTAGCCA GGATGGTCTC GATCTCCTGA CCTCGTGATC CX3CCCGCYTC GGCXTTCCCAA 300 

AGrTGCTCGGA TTACAGGCGT GAGCCACCGT GCCTGCCCCA GAATCGTTTT TAAAGCCACA 360 

GTTGAGARGC CACCCATTGC CCGGCGCCTG GACAGTGATC ATCTTCTTCA TCTTCTPCAG 420 

TCCTTTCPTG TGrTGATTGGA ATTATTCATC CCCTTTGAAA GATGAGAAGG TTGAGATCCA 480 

AAGAGTCTAC CTTTCCAAGT TCTCACTGCT GGAAAGARCT AGAAGCACAG TTCAAAGITC 540



TGGmrCTQG ACTCTGCAGT CCAGGTYTCC CTTYTCCCAC TTGCCTACCX: TCAATCCCAC 600
ACTGTTTTTG AAGTGGCCCA TAACTTGAAG GRAAAGTTTA AAGACAGTTC AATTTAATCA 660
TCAGRATGCA TTCTTTTTTT TTTCGGARAC GGAKTTTCAC TCTTGCTCCC CASGCTCGAG 720
TGCAATGGTG CAATGATCTC GGCTCACTGC AACCTATGCC TCCTGGGTTC AAC2«aOTAT 780
CCAGCCTCAG CCTCCCGAGT AGCTGGGATT ATGGGCGCCC ACCACCATGC CCAGCTAATT 840
TTrGTATTTT TTTTTTTAGT AGAGATGGGG TTTCGCCAGG TTGGCCAGGC TCEeTCTTOIG 900
AAYTCCTGGC YTCAGGrGAT YTGCTCACYT CATCYICCAA AAGTGCIGGG ATTACAGGCA 960
TGAGCCACTG CGCCTGGCYT CAGAATGCAT TCTTACACAT CTATCCTAGA CATTTATAAG 1020
CACTCTAATG GATAACAATC CAAGAATAAA TGATTGTAAA AGATGATCCC GAAGACTTGA 1080
TGTCAATCTT TTTTTCCTAA GAAAAAAAGT CCGOGAGTAT TAAATATTTA GATCAATOIT 1140
TATAAAATGA TTACTTTGTA TATCTCATTA TTCCTATTTT GGAATAAAAA CTCACOTTCT 1200
TTAATCATAT ACTTGTCTTT TGTAAATAGC AGCTTTTGriG TCATTCTCCX: CACTITATTA 1260
GTTAATTTAA ATTGGAAAAA ACCCTCAAAC TAATATTCTT GTCTGTTCCA GTCTTATAAA 1320
TAAAACTTAT AATGCATGTA AAAAAAAAAA A 1351



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2066 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEENESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

GGCACGAGGC GGCGGAGGGC CACAATCACA GCTCCGGGCA TTGGGGGAAC CCGAGCCGGC 60 

TGCGCCGGGG GAATCCGTGC GGGCGCCTTC CGTCCCGGTC CCATCCTCGC CGCGCTCCAG 120 

CACCTCTGAA GTTTTGCAGC GCCCAGAAAG GAGGCGAGGA AGGAGGGAGT GTCTCAGAGG 180 

AGGGAGCAAA AAGCTCACCC TAAAACATTT AITTCAAGGA GAAAAGAAAA AGGGGGGGCG 240 

CAAAAATGGC TGGGGCAATT ATAGAAAACA TGAGCACCAA GAAGCTGTGC ATTOTTCGIG 300

GGATTCTGCT CGTGTTCCAA ATCATCGCCT TTCTGGTGGG AGGCTTGATT GCTCCAGGGC 360

CCACAACGGC AGTGTCCTAC ATGTCXX3TGA AATGTGTGGA TGCCCGTAAG AACCATCACA 420 


AGACAAAATG GTTCGTGCCT TGGGGACCCA ATCATTGTGA CAAGATCCGA GACATTGAAG 480 

AGGCAATTCC AAGGGAAATT GAAGCCAATG ACATCGTGTT TrCTGTTCAC ATTCCCCTCC 540 

CCCACATGGA GATGAGTCCT TGGTTCCAAT TCATGCTGTT TATCCTGCAG CTGGACATTG 600

CCTTCAAGCT AAACAACCAA ATCAGAGAAA ATGCAGAAGT CTCCATGGAC GTTTCCCTGG 660 

CTTACCGTGA TGACX5CATTT GCTGAGTGGA CTGAAATGGC CXIATGAAAGA GTACCACGGA 720 


AACTCAAATG CACCTTCACA TCTCCCAAGA CTCCAGAGCA TCAGGGCCGT TACTATGAAT 780 

GTGATGTCCT TCCTTTCATG GAAATTGGGT CTGTGGCCCA TAAGTTTTAC CTTTTAAACA 840 

TCCGGCTGCC TGTGAATGAG AAGAAGAAAA TCAATGTGGG AATTGGGGAG ATAAAGGATA 900

TCCGGTTGGT GGGGATCCAC CAAAATGGAG GCTTCACXIAA GGrrGTGGTTT GCCATGAAGA 960 

CCTTCCTTAC GCCCAGCATC TTCATCATTA TGGTOTGGTA TTGGAGGAGG ATCACCATGA 1020 


TGTPCCCGACC CCCAGTGCTT CTGGAAAAAG TCATCriTGC CXOTGGGATT TCCATGACCT 1080 

TTATCAATAT CCCAGTQGAA TGGTTTTCCA TCGGGTTTGA CTGGACCTGG ATGCTGCTGT 1140 

TTGGTGACAT CCGACAOGGC ATCITCTATG CX3ATGCTTCT GTCCTTCTGG AraVTCTTCT 1200

GTQGCGAGCA CATGATGGAT CAGCACGAGC G6AACCACAT TGCAGQGTAT TGGAAGCAAG 1260 

TCQGACCCAT TGCCGTTGGC TCCTTCTGCC TCTTCATATT TGACATGTGT GAGAGAGQGG 1320 


TACAACrCAC GAATOXTTC TACAGTATCT GGACTACAGA CATTGGAACA GftGCTGGCCA 1380 

TGGCCTTCAT CATCGTGGCT GGAftTCTGCC TCTGCCTCTA CTTCXTPGrTTT CTATGCTTCA 1440 

TGGTATTTCA GGrTGTTTCGG AACATCAGTG GGAAGCAGTC CAGCXTTGCCA GCTATGA3CA 1500

AAGTCCGGCG GCTACACTAT GAGGGGCTAA TTTTTAGGTT CAAGTTCCTC ATGCTTATCA 1560 

CCITGCCCTG CGCTGCCATG ACTGTCATCT TCTTCATCGTr TAGTCAGGTA ACGC3\AGGCC 1620 


ATTQGAAATG GGGOGGCGTC ACAGTCCAAG TGAACAGTQC CTTTTTCACA GGCATCTATG 1680 

GGATGTQGAA TCTOTATGrC TTTGCTCrrGA TGTTCTTGTA TGCACCATCC CATAAAAACT 1740 

ATGGAGAAGA CCAGTCCAAT GGAA3X5CAAC TCCCATGTAA ATCGAGGGAA GATTGTGCTT 1800

TGrrTTGTTTC GGAACTTTAT CAAGAAnCT TCAGOGCTTC GAAATATTCC TTCATCAATG 1860 

ACAACGCAGC TTCTGGTATT TGAGTCAACA AGGCAACACA TGTTTATCAG CTTTGCATTT 1920 


GCAGTTGTCA CAGTCACATT GATTGTACTT GTATACGCAC ACAAATACAC TCATTTAGCX: 1980 

TTrATCTCAA AATGTTAAAT ATAAGGAAAA AAGCGTCAAC AATAAATATT CTTGAGEATA 2040 

AAAAAAAAAA AAAAAAAAAA AAAAAA 2066

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1705 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

AATTCGGCAK AQGGCAGCTG TCGGCTGGAA GGAACTGGTC TGCTCACACT TGCTGGCTTG 60 

CGCATCAGGA CTGGCTTTAT CTCCTGACTC ACGGTGCAAA GGTGCACTCT GCGAACGrTTA 120 

AGTCCGTCCC CAGCGCTTGG AATCCTACGG CCCCCACAGC CGGATCCCCT CAGCCTTCCA 180 

GGTCCTCAAC TCCCGYGGAC GCTGAACAAT QGCCTCCATG GGGCTACAGG TAATGGGCAT 240 

OGCGCTQGCC GTCCTGGGCT GGCTGGCCGT CATGCTGTGC TGCQCGCTQC CCATCTrOGCG 300 

CGTGACGGCC TTCATCGGCA GCAACATTGT CACCTCGCAG ACCATCTGGG AGGGCCTATG 360 

GATGAACTGC GTGGTGCAGA GCACCGGCCA GATGCAGTGC AAGGTGTACG ACTCGCTGCT 420 

GGCACTGCCG CAGGACCTGC AGGCGGCCCG CGCCCTOGTC ATCATCAGCA TCATCGTGGC 480 

TGCTCTGGGC GTGCTGCTGT CCGTGGTGGG GGGCAAGTGT ACCAftCTGCC TQGAGGATGA 540 

AAGCGCCAAG GCCAAGACCA TGATCGTGGC GGGCGTGGTG TTCCTGTTCSG. CCGGCCTTAT 600 

QGTGATAGTG COGGrTGTCCT GGACGGCCCA CAACATCATC CAAGACTTCT ACAATCCGCT 660 

GGPGGCCTCC GGGCAGAAGC GGGS^TQGG TGCCTCGCTC TACGTCGGCT GGGCCGCCTC 720 

CGOiCTGCTG CTCCTTGGCG GGGGGCTGCT TTGCTGCAAC TGTCCACCCC GCACAGACAA 780 

GCCTTACTCC GCCAAGTATT CTGCTGCCCG CTCTGCTGCT GCCAGCAACT ACGTGTAAGG 840 

TGCCACGGCT CCACTCICTT CCTCTCTGCT TTGTTCTTCC CTGGACTGAG CTCAGCGCAG 900 

GCTGTGACCC CAGGAGGGCC CTGCCACGGG CCACTGGCTG CTGQGGACTG GGGACTQGGC 960 

AGAGACTGAG CCAGGCAGGA AGGCAGCAGC CTTCAOCCTC TCTOGCCCAC TCGGACAACT 1020 

TCCCAAGGCC GCCTCCTGCT AGCAAGAACA GAGTCCACCC TCCTCTGGAT ATTGQGGAGG 1080 

GACGGAAGTG ACAQGGTGTG GTGGTGGAGT GGGGAGCTGG CTTCTGCTGG CCAGGATGGC 1140 

TTAACCCTGA CTTTGGGATC TGCCTG C ATC GGTGTTGGCC ACTGTCCCCA TTTACATTTT 1200 

CCCCACTCTG TCTGCCTGCA TCTCCTCTGT TGCGGGTAGG CCTTGATATC ACCTCTQGGA 1260 

CTGTGCCrrG CTCACCGAAA CCCGCGCCCA GGAGTATGGC TGAGGCCTTG CCCACCCACC 1320 

TGCCTQGGAA GTGCAGAGTG GATQGACGGG TTTAGAGGGG AGGGGCGAAG GTGCTGTAAA 1380
CAGGTTTGGG CAGTGGTGGG GGAGGGC3GCC AGAGAGGCGG CTCAGGTTGC CCAGCTCTGT
GGCCTCAGGA CTCTCTQCCT CACCCGCTTC AGCCCAGGGC CCCTGGAGAC TGATCCCCTC 
TGAGTCCTCT GCCCCTTCCA AGGACACTAA TGAGCCTGGG AGGGTGGCAG GGAGGAGGGG 
ACAGCTTCAC CCTTGGAAGT CCTGGGGTTT TTCCTCTTCC TTCTTTGTGG TTTCTGTTTT 
GTAATTTAAG AftGAGCTATT CATCACICTA ATTATTATTA TTTTCTACAA TAAATGGGAC 
CTGTGCACAG GRAAAAAAAA AAAAG 

(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1167 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 
TGCAGGAATT CGGCAGAGGT TTTCCGCTAG ACTCTGGCAG TTGGTGAGCA TCATGGCftAC 
CGTTACAGCC ACAACCAAAG TCCOGGAGAT CCGTGATGTA ACAAGGATTG AGCGAATCGG 
TGCCCACTCC CACATC0C3GG GACIGQGGCT GGACGATGCC TTGGAGCCIC GGCAGGCTTC 
GCAAGGCATG GTQQGTCAGC TGGCGGCACG GCGGGCGC5CT GGCGTOGrTGC TGGAGATGAT 
COGGGAAGGG AAGftTTGCCG GTCGGGCAGT CCTTATTGCT GGCCftGCCGG GCACGGGGAA 
GACGGCCATC GCCATGGGCA TGGCGCAGGC CCTQGGCCCT GACACGCCAT TCACAGCCAT 
CGCCGGCAGT GAAATCTTCT COCTCGAGAT GAGCAAGACC GAGGCGCTGA CGCAGGCCTT 
COGGCGGTCC ATCGGCGTTC GCATCAAGGA GGAGACGGAG ATCATCGAAG GGGAGGTQGT 
GGAGATCCAG ATTGATCGAC CAGCAACAGG GACGGGCTCC AAGGTGQGCA AACTGACCCT 
CAAGACCACA GAGATQGAGA CCATCTACGA OCTOOGCACC AAGftTGATTG AKTOCCTGAC 
CAAGGACAAG GTCCAGGCCG GGGACGTGAT CACCATCGAC AAQGCGACQG GCAAGATCTC 
CAAGCTGQGC CQCTCCTTCA CACGCGCCCG CGAACTACGA CGCTATQGQC TCCCAGACCA 
AGTTOGTGCA GTGOCCAGAT GGGGAGCTCC AGAAACGCAA GGAOGTQGTC CACftCCGTGT 
CCCTGCACGA GATCGACGTC ATCAACTCTC GCACCCAGGG CTTCCTGGCG CTCTTCTCAG 
GTGACACAGG GGAGATCAAG TCAGAAGTCC GTGAGCAGAT CAATGCCAAG GTGGCTGAGT 
GGCGCGAGGA GQGCAAGGCG GAGATCATCC CTQGAGTGCT GTTCATCGAC GAGGTCCACA 
TGCTGGACAT CGAGAGCTTC TCCTTCCTCA ACCGGGCCCT GGAGAGTGAC ATGGCGCCTG 
TCCAGCAGGT CTATGGGQAT GCCGTGAGGG CTCTGGTAGC TGGTGCCCCG GATTCGCGTG
ATGCCACGGT TGGTOGCCTC CTTGCCGAATT CCTGCMXXC GGGGGATCCA CTAGTTCTAG
AGOQGCCGCC ACCGCGGTGG ANCTCCN 

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1907 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 
GGCACAGGGG AATCATCGTG TGATGTGTGT GCTGCCTTTG TGAGTGTC3TG GAGTCCTGCT 
CAGGTGTTAG GTACAGTC5TG TTTGATCGTG GTC3GCTTGAG GGGAACCCTT GTTCAGAGCT 
GTGACTGCGG CTGCACTCAG AGAAGCTC5CC CTTQGCTGCT CGTAGCGCCG GGCCTTCTCT 
CCTCGTCATC ATCCAGAGCA GCCAGTGTCC GGGAGGCAGA AGGTACCGGG GCAGCTACTG 
GAGGACTGTG CGGGCCTQCC TGGGCTGCCC CCTCCGCCGT GGGGCCCTGT TGCTGCTGTC 
CATCTATTTC TACTACTCCC TCCCAAATGC GGTOQGCCCG CCCTTCACTT GGATGCTTGC 
CCTCCTGGGC CTCTOGCAGG CACTGAACAT CCTCCTGGGC CTCAAGGGCC TGGCCCCAGC 
TGAGATCTCT GCAGTGTGTG AAAAAGGGAA TTTCAACGTC GCCCATGGGC TGGCATGGTC 
ATATTACATC QGATATCTGC GGCTGATCCT GCCftGAGCTC CAGGCCCGGA TTCGAACTTA 
CAATCAGCAT TACAACAACC TGCTACGGGG TGCAGTGAGC CAGCGGCTGT ATATPCTCCT 
CCCATTGGAC TGTGGGGTCC CTGATAACCT GAGTATGGCT GACCCCAACA TTCGCTTCCT 
GGATAAflCTG CCCCAGCASA CCGGTGACCG TGCTGGCATC AAGGATCGGG TTTACAGCAA 
CAGCATCTAT GAGCTTCTGG AGAACGGGCA GCGOGCGGGC ACCTGTGTCC TQGAGTACGC 
CACCCCCTTG CAGACTTTGT TTGCCATGTC ACAATACftGT CAAGCTGGCT TTAGCGGGGA 
GGATAGGCTT GAGCAGGCCA AACTCTTCTG CCGGACACTT GAGGACATCC TGGCAGATQC. 
CCCTGAGrCT CAGAACAACT GCCGCCTCAT TGCCTAOCAG GAACCTGCAG ATGACAGCAG 
CTTCTCGCTG TCCCAGGAGG TTCTCCGGCA CCTGCGGCAG GAGGAAAAGG AAGAGGTTAC 
TGTGGGCAGC TTGAAGACCT CAGCGGTGCC CAGTACCTCC ACGATGTCCC AAGAGCCTGA 
GCTCCTCATC AGTGGAATGG AAAAGCCCCT OCCTCTCCGC ACGGATTTCT CTTGAGACCC 
AGGGTCACCA GGCCAGAGCC TCCAGTGGTC TCCAAGCCTC TGGACTGGGG GCTCTCTTCA 
GTGGCTGAAT GTCCAGCAGA GCTATTTCCT TCCACftGGGG GCCTTGCAGG GAAGGGTCCA
GGACTTGACA TCTTAAGATG CGTCTTGTCC CCTTGGGCCA GTCATTTCCC CTCTCTGAGC
CTCGGTGTCT TCAACCTGTG AAATGGGATC ATAATCACTG CCTTACCTCC CTCACX3GTTG
TTGTGAGGAC TGAGTGTGTG GAAGTTTTTC ATAAACTTTG GATGCTAGTG TACTTAGGGG
CJTGTGCCAGG TOrCTTPCAT GGGGCXTETCC AGACCCACTC CCCACCCTTC TCCCCTTCCT
TTGCCCQGGG ACGCCGAACT CTCTCAATGG TATCAACAGG CTCCTTCGCC CTCTGCXTTCC
TGGTCATGTT CCATTATTGG GGAGCCCCAG CAGAAGAATG GAGAGGAOGA GGAGGCTGAG
TTTGGGGTAT TGAATCCCCC GGCTCCCACC CTGCAGCATC AAGGTTGCTA TGGACTCTCC
TGCCX3GGCAA CTCTTGCCTA ATCATGACTA TCrCTAGGAT TCTGGCACCA CTTCCTTCCC
TGGCCCCTTA AQCCTACCTG TGTATCGGCA CXXICCACCCC ACTAGAGTAC TCCCTCTCAC
TTGCGGTTTC CTTATACTCC ACCCCTTTCT CAACXXTTCCT TTTTTAAAGC ACATCTCAGA
TTAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAGGG CGGCCGC



(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 611 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

ATGAATTAAC GCCAAGCTNT NAATAGGGAC TCACTATGGG GGAAAGNTGG GTAACGCCTG 

CAGGTACCGT TOCQGAATTC COQGGTCGAC CCACGCGTCC GATGGGGCTT TAGTAAATCA 

GGCTTGCAGG CTCAAAGCTG CAATCTGCCC ACTCTCAGGT ACTGAGACTT TGTGGGCCTC 

AGACACCAGG AAGAAAGTTG GGATACAGTC ATTTGAGTTA AAAAGGGAAT GACCCCTCflG 

AAACCCGCAT TAGCAGTGTT AdCTTOGAA GTGCCTTTAC TTTTAAOGCT CTCTGTTCTG 

AAAAAGAGGT GTTTGGTTAC GTGTGAGCCA ACATCACGTT TTGnTAGCTG TGATTTACCT 

TTGTCCCTITr AAAAGACTTC ACGGAGCCAT TCTGTATACA AGGTGTOCTC TTTCCAATGT 

AGAAGGGGTT ATGGAAAAGG GTGOGATCCT TTGCTGTAAA CTGGAGAGAC CAGTCCCAAA 

CAGAGGGGAA TTTTAAGCCC TTCTCATCAC CCAATTQGAT GTrTTTGCTT ATAGCAAATT 

CCTGCAAAAT AAATAAATAA ATATTTGCAA AACTAAAAAA AAAAAAAAAA AAAAAAAAAA 

GG0GGGIKX3^ C

(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2632 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
TCCCAGCTCT CAGGACAAGG GCCCTGGGCG ATCTTTTAAA AAAGCCGATT GGGTGTCTTT 
CTAAAAirrAC AACCAGTACT TCATCGTCAA GTTTCTGGGA AGGGAGTCCC CTCCAGATTC 
TCATGGAGTG ACAAATCTTG ACTCTTGCTC CTGGAATTTT TCAGGCCCAA ACTAGCGTTT 
CTACAATGAT TTATTTGGCA AAmCTCTT GATTATGGGT GGCTGATGAG GAACGTGCTT 
TTGTTAGGAA CCGAAACTQG GCGGCGCJTGA GGGOGTGTAC GCAATGAGTC CGGAAGAGQG 
TGAAATGCTT TCGGTAGGCA CTCCACQGCT GTGAAGATGG CGGCGGCTGC GTGGCTTCAG 
GTGTTGCCTG TCATTCTTCT GCTTCTGGGA GCTCACCCGT CACCACTGTC CJl-iTiTCAffT 
GCGGGACCGG CAACCGTAGC TOCTGCOGAC CGGTCCAAAT QGCACATTCC GATACCGTCG 
GGGAAAAATT ATTTTAGTTT TGGAAAGATC CTCTTCAGAA ATACCACTAT CTTCCTGAAG 
TTTGATGGAG AACCTTGTGA CCTGTCTTTG AATATAACCT GGTATCTGAA AAGCGCTGAT 
TGTTACAATG AAATCTATAA CTTCAAGGCA GAAGAACTAG AGTTGTATTT GGAAAAACTT 
AAGGAAAAAA GAGGCTTGTC TGGGAAATAT CAAACATCAT CAAAATTGTT CCAGAACTGC 
AGTGAACTCT TTAAAACACA GACCTTTTCT QGAGATTTTA TGCATCGACT GCCTCTTTTA 
GGAGAAAAAC AGGAGGCTAA GGAGAATGGA ACAAACCTTA CCTTTATTGG AGACAAAACC 
GCAATGCATG AACCATTGCA AACTTGGCAA GATGCACCAT ACATTTTTAT TGTACATATT 
GGCATTTCAT CCTCAAAGGA ATCATCAAAA GAAAATTCAC TGAGTAATCT TTTTACCATG 
ACTGriTGAAG TGAAGGGTCC CTATGAATAC CTCACACTTG AAGACTATCC CTTGATGATT 
TrTTTCATGG TGATGTGTAT TGTATATGTC CTGTrPGGTC TTCTGTGGCT GGCATOGTCT 
GCCTGCTACT GGAGAGATCT CCTGAGAATT CAGTTTTGGA TPGGTGCTGT CATCTTCCTG. 
GGAATGCTTG AGAAAGCTGT CTrCTATGCG GAATTTCAGA ATATCCGATA CAAAGGARAA 
TCTGTCCAGG GTGCTTTGAT CCTTGCAGAR CTGCTTTCAG CAGTGAAACG CTCACTGOCT 
CGAACCCTGG TCATCATAGT CAGTCTGQGA TATGGCATCG TCAAGCCAOG CCTQGAGTCA 
CTCTTCATAA GGTrGTAGTA GCAGRAGCCC TCTATCTTTT GITCTCTQGC ATGGAAGGGG 
TCCTCAGAGT TACTQGGGCC CAGACTGATC TTGCTTCCTT GGCCTTTATC CCCTTGGCTT 
TCCTAGACAC TGCCrTGTGC TGGTGGATAT TTATTAGCCT GACTCAAACA ATGAAGCTAT
TAAAACTTCG GAGGAACATT GTAAAACTCT CTTrGTATCG GCATTTCACC AACAOSCTTA 1560

nrrOGCAGT GGCAGCATCC ATrcrrcTTTA TCATCTGGAC AACCATGAAG TTCAGAATAG 1620 

TGACATGTCA GrTCGGACTGG CGQGAGCTGT GGGTAGACGA TGCCATCTGG CX3CTTGCTGT 1680

TCTCCATGAT CCTCTTTGTC ATCATGGTTC TCTQGCXaCC ATCTGCAAAC AACCAGAGGT 1740 

TTGCcrrrrc ACCATTGTCT GAGGAAGAGG AGGAGGATGA ACAAAAGGAG CCTATOCTGA 1800


AAGAAAGCTT TGAAGGAATG AAAATGAGAA GTACCAAACA AGAACCCAAT GGAAATAGTA 1860 

AAGTTAACAA AGCACAGGAA GATGATITCA AGTGGGTAGA AGAGAATGTT CCTTCTTCTG 1920 

TGACAGATGT AGCACTTCCA GCCCTTCTGG ATTCAGATGA GGAACGAATG ATCACACACT 1980

TTGAAAQGTC CAAAATQGAG TAAGGAATGG GAAGATTTGC AGTTAAAGAT GGCTACCATC 2040 

AGGGAAGAGA TCAGCATCTG TGTCAGTCTT CTGTACGGCT CCATGGGATT AAAGGAAGCA 2100 


ATGACATCCr GATCTGTTCC TTGATCTTTG GGCATTGGAG TTGGCGAGAG GTGTCAGAAC 2160 

AAAGAGAACA TCTTACTGAA AACAAGTTCA TAAGATGAGA AAAATCTACG AGCTTCTTAT 2220 

TTACAACACT GCTGOXCCT TTCCTCCCAG ACTCTGACAT GGATGTTCAT GCAACTTAAG 2280

TGTGTTOTTC CTGAACTTTC TGTAATGrTTT CATTTTTTAA ATCTGACAAA CTAAAAAGfrT 2340 

TAACGTCTTC TAAAAGATTG TCATCAACAC CATAATATCT AATCTCCAGG AGCAACTGCC 2400 


TGTAATTTTT ATTTATTTAG GGACTTACAT AGGTGATGGG GGAAATTGTT AACTACCTTT 2460 

CATTTTCCTG GGAAGTCAAG GrTTACATCTT GCAGAGGTTG TTTTGAGAAA AAAGGGCXXTT 2520 

TCTGAGTTAA GGAGCCATAG TTCTATCAAT GATCAAAAGA AAAAAAAAAA AACTCGATCG 2580

GCACGAGGGG GGGCXXX3GTA CCCAATTCGC CXTEATGOGAN TCGAATGAGA CX: 2632 

(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2249 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

GAATTCGGCA CGAGCTCACC GTGCTGCGTG ACACAAGGCC AGCCTGCGCC TACGAGCCCA 60 

TQGACTTTKT RATQGCCCTC ATCTACGACA TGGTACTGSW TGrTGGTCACC CTGGQGCTGG 120 


CCCTCTTCAC TCTGrTGCGGC AAGTTCAAGA GGTQGAAGCT GAAOGGQGCC TTCCTCCTCA 180 

TCACAGOCTT CCTCTCTGTG CTCATCTGGG TGGCCTGGAT GACCATGTAC CTCTTOGQCA 240 

ATGTCAAGCT GCAGCAGGGG GATGCCTGGA ACGACCCCAC CTTGGCCATC ACGCTOGCGG 300

CXAGOGCTGG GTCTTCGTCA TCTTCCACGC CATCCCTGAG ATCCACTGCA CCCTTCTGCC 360

AGCCCTGCAG GAGAACACGC CCAACTACTT CGACftCGTCG CAGCCCAGGA TGCGQGAGAC 420 


GGCCITOGAG GAGGACGTGC AGCTQCCGCX5 GGCCTATATG GAGAACAAGG CCTTCTCCAT 480 

GGATGAACAC AATOCAGCTC TCCGAACAGC AGGATTTCCC AACGGCAGCT TGGGAAAAAG 540 

A0CCAGTC3GC AGCTTOGGGA AAAGACCCAG aSCTCCXTTTT AGAAGCAACG TGTATCAGCC 600

AACTGAGATG GCCGTCGTGC TCAACGGTGG GACCATCCCA ACTGCTCCGC CAACJTCACAC 660 

AGGAAGAMAC CTTTGGTGAA AGACTTTAAG TTCCAGAGAA TCAGAATTTC TCTTACCXSAT 720 


TTCCCrCCCT GGCTGTGrrCT TTCTTGAGGG AGAAATCGGT AACAGTTGCC GAACCAGCXr 780 

GCCTCACAGC CAGGAAATTT GGAAATCCTA QCCAAGGGGA TTTCGTGTAA ATGTGAACAC 840 

TGACGAACTG AAAAQCTAAC ACCGACTGCC CGCCCCTCCC CTGCXZACACA CACAGACACG 900

TAATACCAGA CX3VACCTCAA TCCCCGCAAA CTAAAGCAAA GCTAATTGCA AATAGTATTA 960 

GGCTCACrGG AAAATGrTGGC TQGGAAGACT GTTTCATCCT CTGGGGGTAG AACAGAACCA 1020 

AATTCACAGC TQGTOGGCCA GACTGGTGTT GGTTGGAGGT GGGGGGCTCC CACTCTTATC 1080 

ACCTCPCCCC AGCAAGTGCT GGACCCCAGG TAGCCTCTTG GAGATGACCG TTGCGITGAG 1140 

GACAAATGGG GACTTTGCCA CXXXXTTTTGC CPGGrrGGTTT GCACATTTCA GQGGGGTCAG 1200

GAGAGTTAAG GAGGTTOTGG GTGGGATTCC AAGGrTQAGGC CCAACTGAAT CGTGGGGTGA 1260 

GCTTTATAGC CAGTAGAGCT GGAGGGACCC TQGCATGTGC CAAAGAAGAG GCCCTCTGGG 1320 

TGATGAAGTTG ACCATCACAT TTGGAAAGrTG ATCAACCACT GTrCCTTCTA TGGGGCTCTT 1380 

GCrCTAGTGT CTATOGTCAG AACACAGGCC CXX3CCCCTTC CCTTGTAGAG CCATAGAAAT 1440 

ATTCTGGCTT QQGGCAGCAG TCCCTTCTTC CCTTGATCAT CTCGCCCTGT TCCTACACTT 1500

AOGGgPgT A T CTCCAAATCC TCTCCCAATT TTATTCCCTT ATTCATTTCA AGAGCTCCAA 1560 

TGGGGTCTCC AGCTGAAANS CCCTCCX3GGA GQCAGGTTGG AAGGCAQGCA CCACGGCAGG 1620 

TTTTCCGCGA TGATGTCACC TAGCAGGGCT TCAGGGGrTTC CCACTAGGAT GCAGAGATGA 1680 

CCTCTCGCTG CCTCACAAGC AGTICTCACCT CGGGTCCTTT COGTrGCTAT GGTGAAAATT 1740 

CXriGGATGGA ATGGATCACA TGAGGGTTTC TTGTTGCTTT TC5GAGGGTGT QGOGGAIEATT 1800

TTGrrrTGGT TTTTCTGCAG GTTCCATGAA AACAGCCCTT TTCCAAGCCC ATTGTTTCTG 1860 

TCATGGmTC CATCTCTCCT GAGCAACSTCA TTCCTTTGTT ATTTAGCATT TCGAACATCT 1920 

CX3GCCATrCA AAGCCCCXM* GnTCTCTGCA CiX?mX3GCX! AGCATAACCT CTAGCATCGA 1980 

TTCAAAGCAG AGTTTTAACC TGACGGCATG GAATCrKTAA ATGAGGGTQG GTCCnCTGC 2040 

AGATACTCTA ATCACTACAT TGCTTTTTCT ATAAAACTAC CCATAAGCCT TTAACCTTTA 2100
AAGAAAAAIX; AAAA_:^JGC?rrA G?:?rTTGGGG GCCGGGGGAG GACTGACXXX: TTCATAAGCX:
AGTACGTCTG AGCTGAGTAT GrTTCAAT^-^.. .^^rCTTTTGAT iVTTTCTCAAA AAAAAAAAAA 
A.AAAAMCCCG GGGGGGGGCC C3CL^-CCT3G 

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2153 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
GATACTATAA GGCAAGTGAC CTCACGGGrGC GCCGTTAGAC TAGTGGATCX: CGGGTGCAGG 
AATTOGGCAG AGCGCCGCCG GAGCCGAAGT GCTGGCGCCC CCGCGGCXX3C TGCCTCCGCG 
GANCCCAAAA TC:ATGAAA:?r CACC3TGAAG ACCCCGAAGA AAAGGAGGAA TTCGCCGTGC 
CCGAGAA^AG C^CCGTCGyS CAGTTTAAGG AAGAAATCTC TAAACGTTTT AAATCACATA 
CTGACXIAACT I'LTit/-^^"--^ TTTGCTGGAA AAATTTTGAA AGATCAAGAT ACCTTGAGTC 
AGCATC3GAAT TCVrSATGXA CTT.^CTG^TIC ACCTTOTCAT TAAAACACAA AACAGGOCTC 
AGGATCATTC AGCTCAGCLi ACAAATACAG CIGGAAGCAA TGTTACTACA TCATCAACTC 
CTAATAGTAA C?CTACA:rrr GGmCTGCTA CTAGCAACCC TTTTGGTTTA GGTC5GCCTTG 
GC3GGACTTGC AGG^TCTCS^ AGCrTGGGT? TGAATACtAC CAACTTCTCT GAACTACAGA 
GTCAGATGCA GCGACAACTT TTGTCTAACC CTGAAATGAT GGTCCAGATC ATGGAAAAVgC 
CCTTTGrrCA GPiGCATGCrc ITHC^J^JiSCCT GACCTGATQJ AGACAGTTAA TTATGGCCAA 
TCCACAAATG CVGCACmA TAC^JSAGAAA TCCCAGAAAT TAGTCATATG TTGAATAATC 
CAGATATAAT GAGACAAACG TTGGAACTTG CCCAGGAATC CAGCAATGAT GCAGGAGATO 
ATGAGGAACC AGGACCGA3C TTIGAGCAAC CTAGAAAGCA TCCCAGGGGG ATATAATGCT 
TTAAGGCGCA TGT3VCACAGA TASOCAGGAA CCAATGCTGA GTGCTGCACA AGAGCAGTTT 
GGTGGrTAATC CATITGCTTC CTTSGTCAGC AATACATCCT CTGGTGAAGG TAGTCAACCT 
TCCCGTACAG AAAATAG?£A TCGCTryXC AATCCATGGG CTCCACAGAC TTCCCAGAGT 
TCATCAGCTT CC?J3COGC2C TGCCAGCAC? GTOGGTOGCA CTACTGGTAG TACTGCCAOT 
GGCACTTCTG GQCAGAGTAC TACIGCGCCA AATTTGGTGC CTGGAGTAGG AGCTAGTATG 
nCAACACAC CAGGAATGCA GAGCTTGrrTO CAACAAATAA CTGAAAACCC ACAACTTATG
CAAAflCATGT TGTCTGCCCC CTACATGAGA AGCATGATGC AGTCACTAAG CCAGAATCCT
GflCCTTGCTG CACAGATGAT GCTGAATAAT CCCCTATTTG CrCGAAATCC TCflGCTTCAA 
GAACAAATGA GACAACAGCT CCCAACTTTC CTCCAACAAA TCCAGAATCC TGATACACTA 
TCAGCAATGT CAAACCCTAG AGCAATGCAG GCCTTGTTAC AGATTCAGCA GGGTTTACAG 
ACATTAGCAA CGGAAGCCCC GGGCCTCATC CTAOQGTrTA CTCCTGGCTT GGGGGCATTA 
GGAAGCACTG GAGGCTCTTC GGGAACTAAT GGATCTAACG CCACACCTAG TGAAAACACA 
AGTCCCACAG CAGGAACCAC TGAACCTQGA CATCAGCAGT TTATTCAGCA GATGCTGCAG 
GCTCTTGCTG GAGTAAATCC TCAGCTACAG AATCCAGAAG TCAGATTTCA GCAACAACTG 
GAACAACTCA GTGCAATGGG A1TTTTGAAC CCrTGAAGCAA ACTTGCAAGC TCTAATAGCA 
ACAGGAGGTG ATATCAAIXX: AGCTATTGAA AGGTTACTGG GCTCCCAGCC ATCATAGCAG 
CATTTCTGTA TCTRGAAAAA ATGTAATTTA TTTTTGATAA CGGCTCTTAA ACTTTAAAAT 
ACCTGCTTTA TTTCATTTTG ACTCTTGGAA TTCTGTGCTG TTATAAACAA ACCXAATATG 
ATGCATTTTA AGGTGGAGTA CAGTAAGATG TGTQGGTrTT TCTGTATTTT TCTTTTCTGG 
AACAGTTGGGA ATTAAGGCTA CTGCATGCAT CACTTCTGCA TTTATTGrTAA TTTTTTAAAA 
ACATCACCTT TTATAGTTQG GTGACCAGAT ITiCTCCTGC ATCTGTCCAG TTTATTTGCT 
TTTTAAACAT TAGCCTATGG TAGTAATTTA raTAGAATAA AAGCATTAAA AAGAAGCAAA 
AAAAAAAAAA AAAAATTCCT GCGCCCGCGA ATTCTTCT 

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1043 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:
CTGAAGTGTA TGTGGTGAGG AAGAAGAQQC TCCTACTGTA GACAGCCTTG TTCTACAGAT 
CCTCCCAGAA ATCTCTGGGC CAGGTGGAAC CCAGGGTrCAG AGAGGGATGG GAGAGAGGTT 
TAATTTTCCA TGATAAATAA AAATCTATAA AATAATAAAC AAGAGAAAAG AGATTGGAAA 
CAGCCAQC?rr GGAGCAGTGA GTGAGTAAGG AAACCTGGCT GCCCTCTCCA GATTCCCCAG 
GCrCTCAGAG AAGATCAGCA GAAAGTCTGC AAGACCCTAA GAACCATCAG OCCTCAGCTG 
CACCTCCTCC CCTCCAAGGA TGACAAAGGC GCTACTCATC TATTTGGTCA GCAGCTTTCT 
TGCCCTAAAT CAGGCCAGCC TCATCAGTCG CTGTGACTTG GCCCAGGTGC TGCAGCTGGA
RGACTTOGAT GGGTTTGAGG GTTACTCCCT GAGTGACTGG CTGTGCCTQG CTTTTGTOGA 480

AAGCAAC3TTC AACATATCAA AGATOAATGA AAATGCAGAT GGAAGCTTTG ACTATGGSCT 540 

CTTCCAGATC AACAGCCACT ACTGGTQCAA CRATTATAAG AGTTACTCGG AAAACCTTTG 600 

CCACGTAGAC TGrCAAGATC TGCTGAATCC CAACCTTCTT GCAGGCATCC ACTGCGCAAA 660 

AAGGATTGTG TCX3GGAGCAC QGGGGATGAA CAACTGGGTT AGAATGGAAG KTTQCACTGT 720

TCAGGCCGGC CACTCTTCTA CTGGCTGACA GGATGCCGCC TGAGATKAAA CARQGTGCGG 780 

GTGCACCGTO GARTCATTCC AAGACTCCTG TCCTCACTCA RGGATTCTTC ATTTCTTCTT 840 

CCTACTGCCT CCACTTCATG TTATTTTCTT CCCTTCCCAT TTACAACTAA AACTGACCAG 900 

AGCCOCAGGA ATAAATGGTT TTCTTGGCTT CCTCCTTACT CCCATCTGGA CCCAGTCCCC 960 

TGGTTCCTGT CTGTTATTTG TAAACTGAGG ACCACAATAA AGAAATCTTT ATATTTATCG 1020

AAAAAAAAAA AAAAAAAACT CGA 1043 



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 703 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

GAATTOGGCA CGAGTGCGCG GGCACCACGG OGGTTTTTCG ACGCTGGCGG TGGACGCAGG 60 

CACCATGGAC CACGGTTGCT GQGCGGATGG QGAGCGTCTA TQGTCAGTTG CCTTAGAAGT 120 

GGTGftGATGG GAAGCTGCAG TTGGAAGACC CTGGAGGATG CCTGACAAGG GGATGTCTGA 180 

CACATGATTG GAGCTCTTTT TGAAATGTTT CTTQCOCTTC CTGGAGCAGA GGAGCCATTA 240 

TTTATGCAGG TACATCGAAG TCITTTGACC TCCATACAGT GATTATGCTT GTCATCGCTS 300

GTQGTATCCT GGCGGCCTTG CTCCTGCTGA TAGTrGTCGT GCTCTGTCTT TACTTCAAAA 360 

TACACAACGC GCTAAAAGCT GCAAAGGAAC CTGAAGCTGT GGCTGTAAAA AATCACAACC 420 

CAGACAAGC3T GTQGTGGGCC AAGAACAGCC AGGCCAAAAC CATTGCCACG GAGTCTTGTC 480 

CTGCCCTGCA GTC3CTGTGAA QGATATAGAA TGTGTGCCAG irrTGATTCC CTGCCACCTT 540 

GCTGTTGCGA CATAAATGAG GGCCTCTGAG TTAGGAAAGG TGGGCACAAA AATCTTCATG 600

AGCAATACTT CTTAGTAGAT TGnTTCTTA TTCAAATCAA GTTCTAGTCT TTTTATGrTGA 660 

GATTATATAA TTTACAGTCST TCTTTTTATAT ACTTTTGAAT AAA 703

(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3684 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

GGCAGAGGGG GCATGAGCAG GAGGAGGATT ACCGCTACGA QGTGCTCACG GCCGAGCAGA 60 

TTCTACAACA CATQGTGC»IA ATGTATCCGG GAGGTCAACG AGGTCATCCA GAATCCAGCA 120 

ACTATCACAA GAATACTCCT TAGCCACTTC AATTGGGATA AAGAGAAGCT AATGGAAAGG 180 

TACTTTGATG GAAACCTGGA GAAGCTCTTT GCTGAGTGTC ATGTAATTAA TCCAAGTAAA 240

AAGTCTCGAA CACGCCAGAT GAATACAAGG TCATCAGCAC AGGATATGCC TTGTCAGATC 300 

TGCTACTTGA ACTACCCTAA CTCGTATTTC ACTGGCCTPG AATGTGGACA TAAGmTOT 360 

ATGCAGTGCT GGAGTGAATA TTTAACTACC AAAATAATGG AAGAAGGCAT GGGTCAGACT 420 

ATTTCGTGTC CTGCTCATGG TTGTGATATC TTAGTrGGATG ACAACACAGT TATGCGCCTG 480 

ATCACAGATT CAAAAGTTAA ATTAAACrrAT CAGCATTTAA TAACAAATAG CTTTGTAGAG 540

TGCAATCGAC TGTTAAAGTG GTGTCCTGCC CCAGATTGCC ACCATGTTGT TAAAGTCCAA 600 

TATCCTGATG CTAAACCTGT TOGCTGCAAA TGTQGGOGCC AATTTTGCTT TAACTGTGGA 660 

GAAAATTQGC ATGATCCTGT TAAATGTAAG TQGTTAAAGA AATGGATTAA AAAGrTCTGAT 720 

GATGRCfiGTG AAACCTCCAA TTQGATTGCA GCCAACACAA AGGAATGTCC CAAATGOCAT 780 

GTCACAATTG AGAAGGATOG TGGTTGTAAT CACATGGTCT GTCGTAACCA GAATTGTAAA 840

GCAGAGTTTT GCTGGGTGTG TCTTGGCCCA TGGGAACCAC ATGGATCTGC CTGGTACAAC 900 

TGTAACOGCT ATAATGAGGA TGATQCAAAG GCAGCAAGAG ATGCACAGGA GCGOTCTAGG 960 

GCAGOCCTGC AGAGGTACCT GTTCTACTGT AATOGCTATA TGAACCACAT GCAGAGOCTG 1020 

OGCTTTGAGC ACAAACTATA TQCTCAGGrTG AAACAGAAAA TGGAGGAGAT GCAGCAGCAC 1080 

AACATGTCCT GGATTGAGGT GCAGTTCCTG AAGAAGGCAG TTGATGTCCT CTGCCAGTGT 1140

CGTQCCACAC TCATGTACAC TTATGTCTTC GCTTTCTACC TCAAAAAGAA TAACCAGTOC 1200 

ATTATCTTTG AGAATAACCA AGCAGATCTA GAGAATGCCA CAGAGGTGCT CTCGGGCTAC 1260 

CTTGAAOGAG ATATTTCCCA AGATTCTCTG CAGGATATAA AGCAGAAAGT ACAAGACAAG 1320 

TACAGATACT GTGAGAGTCG ACGAAGGGrTT TTGTTACAGC ATGTGCATGA AGGCTATGAA 1380 

AAAGATCTGT GGGAGTACAT TGAGGACTGA GAATGGCCCT GCATAAAATG AACTCTGAAA 1440



AdTEACCAT CTAGAGnGCT CATOCAATTA AAACAAAACA AACACAAACA AGGAGGCACT 1500 

AAGCCTATTC TGACACCACT GGTCTGTACT ACCAGAATTG TrTTGTTAAT GGAAAGTTTA 1560 

ACTAAATTAT ATTGTAATAA AAAGGTAGAT AAACCATTGT ACAACAGTAT TCTAGGCCGC 1620 

CAACAAAAGT GTGACAGACA CACTAAAAGC CCTCCAACTT TAACTTGTAA CGTAGCTTCA 1680 

TTCTCAAAGC TGACTCCTTT TTTTTCTTTT TCCTTTTCCT GAGTCTAGTA CAGTTAAAAT 1740

TTCAAACAGC TCCTTGACAC TGCTTTPCAT GTITCAAACCA GCCATTTTGT TGTACTTTGG 1800 

TAAAGGACCT CTTCCCCTTC CTCCCCTACA CATACAGATA CACCCACACA CAGACTGACT 1860 

ciCTTTCTCT cataccxx:aa GGTCATGAGT GAATGATGCT TAGTTCCTTG TAAAGAAAAT 1920 

CTTOGGATOG OGAAAGGGGT AGGCAGCAAG AGGATTCAAC AAACGAAAAA CATAAAAACT 1980 

TTGTATATGA CTTTTAAAAC AAGAQGACAA CACAGTATTT TTCAAAATTG TATATAGCGC 2040

ATATGCATGG ACAAAGCAAG CGTQGCACGT GTTTGCArAA TGTTTAATTA CAAAAAAATA 2100 

TTTATTCTTT AAAAATCTTC AAGATTAXGT CTATTTGCTG TGCATTTTCT TTCAGTTTGC 2160 

TTATCTTTCC CX3GGTTGGGG TTOGGATAAA GGTGTOTCGG TTTAGCACCT CTGGAAGACC 2220 

TATCTflGAGC TCTTTCACTT TCCTGAGGTT ATTTTGCCCy TTCTGGTGTT GGTATGTCTG 2280 

TTGCOGGCCA TQGGCTNCAY GCCTTGAATT CXTPQCTCTTG ATCAGGGACA AGGGAGGTCA 2340

AGCTCTGACT AATOCCATGA CCTGATTAAG GGGrTACAGCA GGGAGTTTTG TTGCTACAGC 2400 

TCATGAATTA ACCTGTCCCA ACCTAATCCC CCTCCATGGC ATCATGCXTTC TACCCAAGCC 2460 

TTTGTGTGCC CATGTTATGC ACACAGCTGT AGGCMTCTT AAGTCCCCTG TCGCATCCAG 2520 

TGGAAGCATT TTAAAATTTC TTTTACTTTT TGGTTTTCCC TTAATTGCTG CTTTTCAGAT 2580 

TTTAGTTATG GCTCGTCTGC TCACXXXTTC TCTACArTAG GGTGTCAAAG AGAATGTTTT 2640

GCTTTAAATA TAAflaCAGCCA TTCATTTAGT CTCAGATTGT GAATTTAAAA TGGTGGKTAC 2700 

CGAAATTGCT TCrGTGTGTT GCTOTOGGTr TGGTTTGAAG GCAAACACCC CTAGAACATG 2760 

ATATTCCCAT CTfiGTOCATT TAAATAGAAA TCACTGAGTT TGCTGCTTTr TTATTGTCAG 2820 

CAGATAQGAG AATrCAATAAT GCATrTTAGC TGTGATCTCC ArTTTTTATGA AATTCCEACT 2880 

AAGAGCTATG TTAAAAGTEAA AOGATGGTQG TGGTTGTATT AACTATATAC CTGTTTAGGC 2940

CATTCTGGCT GTGGTATTTT TCAATAGGTC AGCATCTCTA AATCTGTCAG TTTTATACAG 3000 

GAGTGCAGAG TGAACTAGGC AACTAGATTA AGAGGTCTAA ATATGAAATA (XAGTTGftQG 3060 

CTGAQGACCT CTTCGTCTTC CTTTAAATGT CTTTTGCCTA GQGAGTGTTT ACCATTTGTG 3120 

AGGCAGCTTT GTCTGCTCTT ACACTGTACA TCCTATTACT CCATTGGGAA GTAGGTTCAC 3180 

TTTCCTCTGG CCTTTTGCX:T AAGTTAGGCT TTGCTGAATC AACCCTACTT TTCCTTTTAG 3240
AAAAQGTTCT TACAGGAGAT TTACTGGCAA CilTi ' inTri ' CCCATCAAAA ATCAGTGAAT
GTTTGCtGAG TATAAATGCT GCTTCCTTAA ACCACTTGrTC GCTTTAGGAT CAACTTTACC 
TGTACCTTTT CTCCTTTCCT CXXTTTGCCAC CTCAGGTGCA AATCTGAACT GAGTGTCTGC 
TTCTTCCATT TTCTCGTCTC TCTCCCCTCT TCCCCCATTA TCCATATGAC ATTAITTTAC 
TTCAAATGAC AGCATCAATC TTAAAAAGAT ATACATTAAA ACTAAGGAGT TTTTTTAAAG 
AAAGCCTGAA TAAGrTCCTT TCCCTGGTAA CTTTGAAAAG CAGTCAGAGT TGCTATATAG 
ATATATGTGG CTCCTTTAAA ATGCTTTGTG TATGTffTQGrr GTTTAAAAAA AAAAAAAAAA 
TTCGGGGGGG GGCCCGGTNC CCAT 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1965 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
AAGAAAGGGT ATTAAAATTC TAGATCACAT ATGGACOCGG GAAGGTTTTT NACOCTCTGT 
TAGTGACATC GAGTCTCCCA CTAGACAAAA TAGGTGGAAA AATCTCTCGA GGGCTCACAT 
TGTTTPGTCA TCTTCAGGAA AAACACCACC AGGCCATACC ACAGCCTGCC CAGTGAQGOG 
GTCTTTGCCA ACAGCACOGG GATGCTGGTG GTGGCCTTTG GGCTGCTGGT GCTCTACATC 
CTTCTGGCTT CATCTPGGAA GCGOCCAGAG CCQQGGATCC TGACCGACAG ACAGCCCCTG 
CTGCATGATG GGGAGTGAAG CAGCAGGAAG GGGCTCCCAA GAGCTCCTGG TQGTGCAGCC 
TGTGCTCCCC TCAGAAGCTC TGCTCTTCCC AGGGCTCCCG GCTGGTTTCA GCAGGCGACT 
TTCTTCCAAT GCTGGGCCCA GACTTCITGC CIGGGrTGCTG GCCTGCCCTC TCCGGNCCGC 
TTGCTGCCTG TCTGCTTTCC TTGGTGGYTT TGCTQGGTGC TGGGCCTGCC CTCTCCGGCC 
GCTTGCTGCC TOXTJ'ltXTi'T CCTTGGTQGC TTTGCTGGGT GCTGGGCCTG CCTTCTCTGG 
CTGCTTGCTC CCTGTCTGCT TTCCTTGGTG GCTTTOGCTT CTGCACTCCT TGGCGTCASC 
TCTCAGGTCC TCCATTCACA CGAQGTOCTC CTCGCTCTGG CCGCTCTTGC TGCTCCTGTC 
TGAAGAWATC AGACTGATTT CCTCTTAAGA CTCCTAGQGA TGTGGTGAAG AGCTGGGACT 
CAAGTGCAGT CCAOGGICTG AAACATGAGG GARGTGAGGT GTCCGTCCAC TTCCCCCATA 
AAGGTGTGCA TTPCAGTTAG GCTGCCCCGC CACAGAGCAG GCTTCATCTG CTCTGCCATC
CAGCCCCATC TGGAraTGAG GTGGGGTGGA GACATCATGG GGTGATTGCA GAAAGGGGGA
GTGGCGGCCX: ACGCAGCTTC TGCTGAGGAG CTGACCGCTC TGAGCTGTTC TCTTTCGTAT 

tgctgctctg tgtctgcatg tattgtgacc gtgcxsgctcc acctcttcca gctgctgcta 
cagctgaggc ctggatcccg gcccttccct orgacttacg tgtctgtcac cggcangcag 
ccctacaaat cctggtgacc tgctctccca agaacagagc ctgtccccag atgtcccagt 
agcgaixsagt aacagaggtg gctotogact tcctctactt ctccttgctg gatcagggcc 
ttcctgcctc ccgctgggca ggtctggcct tgctctcttg gcagggcccc agcccctctg 
acx:actctgc agctcaccat gcagctgatg cxiaaagttgt ggtgtccagt gtgcagcagc 
cctgggftgcc actgccacct tcagaggggt tccttgctga gacccacatt gcttcacctg 
gccccaccat gqctgcttgc ctggcccaac ctagcgttct gtgccatgct agagcttgag 
ctgrrgctct tcttcagggg aggaaatagg gttggagagcg ggaagggtct tgctcctaag 
tgttgctgct gtggcttttt tgccttctcc aaagacgcac tgccaggtcc caagcttcag 
actgctgtgc ttagtaagca actgagaagc ctggggtttg gagcccacct actctctggc 
agcatcagca tcctactcct ggcaacatca ggccaacx3tc cacoccagcx: tcacattc5cc 
agatgttggc agaagggcta atattgaccg tcttgactgg ctggagcctt caaagccact 
gggatgtcct ccaggcacct gggtcccatg accagctccc cgtctccata ggggtaggca 
tttcactggt ttatgaagct cgagtttcat taaatatgtt aagaatcaaa gctgtctttg 
ttcaggctgc tataacaaaa atataatagc ctgggtqgct taaac 



(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

agtgatcccc ttgcctcggc ctcccaaaat gctqgaa3tc taagcgtggg cctctgcacc 
cggcctggtc cgcaatttaa aaacgcacag ccaccattcc ctytccagaa agcacccaga 
tgcctttggg agaaccagcc tcctccatqg agqaaagctt gggatctgcc ttcccacctg 

GQGAGGAGAG GGATCTGTGG AAAATCCrrC TGACGGACTT CCCCTCAGTG CCTGATCCAT 
ACTCAATAGT AGAAAAAGTA AGAAATATAC AAAGATAGCA GATACACGGA GACAGTTCCC 
CAAATAGCTG AGCGAWTAGC GCAGAAGCAA TATTGAAGAC CTAATAGCTG AGACATTTCC
AGAACTGATA AAGTGCATCC AGCCACAGAT CAAGCAGCCC AGAAAATTCC AGGCAGCATC
AACAAATAAA TAGCXXXACA TGCACCCX3TG AAAATGCAGA AGACCAAACA AAAAAGTCCX5 
GTCAACAGCC AGAGTTAAAG AGG 

(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1133 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:
GGCACAGCTT GGAATGAACC CCTGrTOGATA AGGGGGACTA TPAGATAGAA TAAACATCAA 
TAAATGCTTG ATGAATAAAC GCTAATCCTA CCTTCCCAGC CTGACACCTC CCAGTGGACA 
CCACACTTCA CTTGAAGCCT TAGAAACCTT TCCCACCCAT GCTTCCAGCC CTGGCTTCAT 
GTTGCCATTT CTCACCCCCA GAACAQGCCG CCCGCCTGAA GAAACTACAA GAGCAAGAGA 
AACAACAGAA AGTQGAGTTT CGTAAAAGGA TQGAGAAGGA GGTGTCflGAT TTCATTCAAG 
ACAGTGGGCA GATCAAGAAA AAGTTTCAGC CAATGAACAA GATOGAGAGG AGCATACTAC 
ATGATGTGGT GGAAGTGGCT GGCCTGACAT CCTTCTCCTT TGGGGAAGAT GATGACTGTC 
GCTATGTCAT GATCTTCAAA AAGGAGTTTG CACCCTCAGA TGAAGAGCTA GACTCTTACC 
GTCGTGGAGA GGAATGGGAC CCCCAGAAGG CTGAGGAGAA GCGGAAOTTG AAGGAQCTGG 
CCCAGAGGCA ANGAGGAGGA GGCAGCCCAG CAGQGGCCTG TGGTGGTGAG CCCTGCCAGC 
GACTftCAAGG ACAAGTACAG CCACCTCATC GGCAAGGGAG CAGCCAAAGA CGCAGCCCAC 
ATQCTACAGG CCAATAflGAC CTAOGGCTGT KPGCCCGTCG CCAATAAGAG GGACACACGC 
TCCATTGAAG AGGCTATGAA TGAGATCAGA GCCAAGAAGC GTCTGCGGCA GAGTGGGGAA 
GAGTTGCCGC CAACCTCCTA GGCGCCCCGC CCAGCTCCCT TTGACCCCTG GGGCAGGGCA 
GGGGGCAGGG AGAGACAAGG CTGCTGCTAT TAGAGCCCAT CCTQGAGCCC CACCTCTGAA 
CCACCTCCTA CCAGCTGTCC CTCAGGCTGG GGGAAAACAG GrPGTTTGATT TGTCACaSTT 
GGAGCTTGGA TATGTGCGTG GCATGTGTGT GTGTGTGTGA GAGTGTGAAT GCACAGGTGG 
GTATTTAATC TGTATTATTC OCCGTTCTT G GAATTTTCTT CCCATGGGGC TGGGGTACTT 
TACATTCAAT AAATACTGTT TAACCCAAAA AAAAAAAAAA AAAAGAAAGA AGN

(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1101 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

GGGCACftGCT GAAGCTGCAG ACCTCCCCAG GGGATGGCTC CTCTCCCCCA GGAGCCCCGA 60 

GGCAGGGGAG GCAGAAAGCC TGGGCTCTGG GGGGTGGCCT GCGGACAGCT GTGCTGTGGG 120 

CCGGGGGCTG GQCCTGTCCC ACAGGGJ3CGT GGAGCTCGTC GTTCTGAGCA GCCAGCTCGG 180 

TGGTGTCTGG GGATAGCTGG GAGGCACAGC GGCTGCCATG TGGGACTGGG ACTGGAGTGC 240 

TCCCTC3GTCT TC3GCCTCTGT GGCTCAGCCT TGCTCTGGTC TGCCTGAGTG CAGGGGCCAA 300

GGGGCACAGG GCCAGTGAGG CCGGCCACGC TCGGC3CCCTC ACCTGTGAGA TGGGGTCGGA 360 

ATTTKACACA GCCTANGGCT TQGTTCTTC3G TKGTNGAMCG TOGACTYCTK AGAACGGGAG 420 

TGCTGGTCCT .GAAAGGCGTG GTTGGAGACC AGCTGCTTTT CTCG CXXJi ' IT TTCTCTTAQG 480 

AGATTAAACA AAAACAGAAA GCACAAGACG AACTCAGTAG CAGACCCCAG ACTCTCCCCT 540 

TGCCAGACGT GGTTCCAGAC GQQGAGACGC ACCIOGTCCA GAACGGGATT CAGCTGCTCA 600

ACGGGCATGC GCCGGGGGCC GTCCCAAACC TCGCAGGGCT CCAGCAGGCC AACOGGCACC 660 

ADGGACTCCT GGGTGGCGCC CTGGCGAACT TGTTTGTGAT AGTTQGGTTT GCAGCCTTTG 720 

CTTACACGGT CAAiGTAaSTG CTGAGGAGCA TCGOGCAGGA GTGAGGCCCA GGCGCCGAGA 780 

CCCAAGGCGC CACTGAGGGC ACCGCGCACC AGAGCGTGAC CTCGGCAGGC TGGACACACT 840 

GCCCAGCACA GGCAGACCCA CCAGGCTCCT AGGTTTAGCT TTTAAAAAOC TGAAAGGGGA 900

AGCAAAAACC AAAATGrrGTC ACTGGGCTTT GGAGGAGACT GGAGCCTCAG CCCTGTCCTG 960 

GCCACGGGCC GCTGGGQCTG GTGTGGGTGG GCCTTGTGTG CTGGATTTGT AGCTTATCTT 1020 

CCGTGTTCTrC TTTGGACCTG TTTTAGTAAA CCCGTTTTTC ATTTTAAAAA AAAAAAAAAA 1080 

AAACTTPGGG GGGGGGCCCC N 1101 



(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 282 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

AGCTTCTCTG TCCAGTCTTG AACTCTGGGS TCTCTTGGAA CTTTCCTCAC CCCTCTCAGC 60 

CTGAATATTC CTTCCATGGA TTCCACTCAA CCAGACTTTG GATCTGTGCC TACTTAAOXIA 120 

ACCTTATCTT TGCAATATCTT TCGGGCCCAC CTTCCACPCC TTGGn?TCTTG TTCCTCCTTG 180 

GCCTAACTTG TCCCTTCTCC ACTTCACATC CCCGGTQGGA CAGCATTCCT CCTTCCTCCC ' 240 

iU^CTCCCTC CGTCTCARAA AAAAAAAAAA AAAAAAAAAA TT 282 



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2635 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

TAAGGGGGTC TGTGCTCACC TCCTCCTGAC CCTTAACACT CCTGTCCTGC CCAGACCAAC 60 

AGAGAGAGCT GTCCCTGftGA CCCCGGAGAG AAGCAGCTGC CGAAAGCTGC AGCCTTTCCG 120 

CACTCTGAGA taVTGATCTT CCTCCK5CCA GGGGAGAGCC ACCCACAQGC CATGTCCAGC 180

CCCACTTCCC TCAGCCCCCA GGGOTTCCTT CTGGCCCCrC TGAGGATTCC CTAGGGCTGC 240 

CCCGCAGAQG GGYTTCCCCA AGCTCTGTTT TGAAGCCTGC AATGTGGAAA AGTGAGAAGT 300 

CAGAGGGAAC AGGACAGGTG CAGCCGQGCT CTGAGGCCAC ACCTCACACC TCGCTGTTCC 360 

CCAACATCCC CTGflGCAGTG TGAGCTCATC TCACCAGATG AGAAGAGGCC CTGTGCATTT 420 

YTTTTGTrTG TTTGTTC3CTG TTTTCCCCCA CCCATCCAGT TCTCCTCAGC AAAGCAAATT 480

CCTTAACACC TTTGGTGGAG AATTTCTTAC CCAGACTTGG GGCTGTGATG CCCTTCAGTG 540 

CGTQGTGAGT GCAGCGTCTG TGCGTGriGCC TGTGTGTGAA CCTGGGGGCC ATCCTGGTGG 600 

CCTGGGAGCG TGAGGAGAGG CCCCCTCTGT QCTGGGrrGAG TGGTQGGTGT GGGGTCAATG 660 

CflGTGAGGCT CTCTGGGTGA GQCTCCCAAC CTGGCAGTCC CCAGCCTCCC AGCftTCTGTG 720 

AGCGTCTGTT GGACTTTACA GAAGAGCCTC ATCCYGTCTG CCCCTCACTC TGCCCTQGAA 780

TCAACATCTT CCX5AGTCCTT CTTGGGGGAA ATAGCAGAGC CCCACTTAAC TCCATAAACT 840 

GCTTCCCATT CCGCAGCCCA GTrCTGATTG TTGAGGTCTC GCGTCGTTCC AGGTCCCCCA 900 

GTCCCCTCTT TCTCCTGTCC TCTCTCTGTC CTTCACCTCC CCACTCCAGC CCCGGCTCAG 960 

TTCAGGGAAA TQCTGTPCCA YATCAGCCCT CTGCTCTCIG AGGCAGCCGC GCCTCTGACT 1020 

CGGAGCTACT TGAAACTTCT GCTCTTGCTA GGATTGGAGT CTACCTATCT CTTOCATTTG 1080

TCCCAGCTGG AGTTCTGGAA CTTTCCTCCT CGGGGTGGGG GTGGGGTTTS TTAiiGGATGC 1140

TGGGGCSGCCT GGGGAAGGAA GGAGTTCAGA GGA.=dGGG?ICT CCCC?S?rcr CTTX-.ICTC:-. 1200 

CCCTCCGCTC CTGGGACACG TGCTCTCTCT ^^C^CTG■3GT CTTCTGC-ZTG TC-CiCGTTTr^ 1260 

TGTGTCCrrG TAAATATGTT TTAGGAAGAA AGCiAA;^GGG CCTCTGCTAG 1320 

GATTQCAGGG GTCCAGCCTT GCCTGTTTCC GAAGCCCCCA CACTGCCTTT CGCTCCACTG 1380

AGACPGGTCC CCPCAAAAGG TAGACAAAAC AGCAGCTCCC TGTGGA&CTG AAGSGCGGCC 1440 

TCAAAGTGGC TTrTTGTTAG ACAAGGTTAA GGTTTCCTCA TGAGCAAGC^T T^--JGA'ICGG 1500 

TCCTICCTCA GCTCCTTGAT TTGTGACCTT GACCVjSGGG CCTGCC;^rCC AGCCCCTCXA 1560 

GTGCCCTCTC CTCGATGCCT CGCTCCTTCC TGCCCCXZACT CCCCTGGrTT AGGCAGGTAG 1620 



GGGAATTAGG GCCA3XXTCG AAGAAGCTTA ACCATGTGTT CAAAGAATSG ' i ' l ' .Vl ' rJ CT 1680

GCTTGGTCCT GGAACTCCCC TTGGCTGCCC CAGCXCTCCT TGGCCCATGG GTSCTGGGGG 1740

AGGTQGATGT CAGATCTGGT AGGTTGCAGC AiSAGftAAATA AATGTGCTiTr GAGAGACCAC 1800

TCAGAGAGGG TCCAAGGGTG ATGGAGAAGG AJiGCATGGCC TGGGAGCTTG GA-JS3GARGG 1860

GTQGTOGGTG GCX3GCATCTT GACTGCCCCC TGTIGrCCCA CACGTGGGGG GrrGCrTCACCC 1920

CYCTTCACTC CAGOCCGCCT GCCTTCAGCC ^TTOIArGAGC TTCACXTGCT TCCiJiJCTrCA 1980

CTTTGGAGGG GGTGGGGTCC GTTGGCATCA AjCACGGGGAC CCTCTGCTTC AjCCAAAGCCC 2040

GAGCCCTCAG CXXTCTGGGGA a:\ACAAATGG CTa^iGCOTTG ATACCTG3GG TC^CGAGAG 2100

GCTGOGGGCT GQCGGCAGTC CCAGGGGAGA GACACCACAG AAG3AGA3CC A^^^VTCCCG 2160

AGGAAGTTCC CAGCAGAGCA AACTGCITTC CAGCCTGAAG CCrTGCTZAAA CTGTGIXyi'rs 2220

TGCAATAACT GAGCTTAGAG TTAGGAAaTG TGTTCAAGTG CTTGGATTZC CGTCICTAGA 2280

TTTAACTGCT GAAATTGTAT CTCTCAGTAA TTTTAGArGT CTTTTAAAAA ATCGAAAAAC 2340

AAAGTGTTAG ACTGTGTGCG TGrTGCGTTGA TGGGCACTCA AGAGTCCZ^ GAG^ATCCA 2400

CSCCXTTGCCTT TOXCTGCGC CCCCATCCTC TCACGTCCCG CCCTGCCTCC AC?IGGGGAC 2460

CCTGCCTCGT GTCGTCTTTA TCTQCCTATT ACTCAGCCTA AGGAAACAAG TACACTCCAC 2520

ACATGCATAA AGGAAATCAA ATGTTATTTT TAAGAAAATG GAAJ^ATAAAA ACTITATAAA 2580

CACCAAAAAA AAAAAAAAAA. ACCCNGGQGG GGGCXTCGGTA ACCCACTTCG CCZAA 2635

(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 994 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
GAATTCGGCA GAGGTTCGGC GAAGATAGGG AATAAGGAAG CACAGGACTA GGGGAGAAGG 
AAGCACAGGA GTAGGGGAGA TATACAGCGG TCAGGATAAG QGGGAAAGGG OGGTC5GTTGC 
SCAAGAGGTG AAACAAGATG TGAGAGACAA GGGGTAGGGA AGAAATC3GGG CAGCGGTTAG 
GTTCAGAAGC GCATAGACCG TGGCGGACGG GCAATGCGAG GGGCACAGAA AGGAACTGAG 
GQGTGGGCTA TTTTAARGGA GATGGTCCTT CAGCCCTCTT YiriTCTGCG TAGTTCTCCT 
CCTCCAGGCC GCG0G(2GGAT ATGTCX3TCCG GAAACCAGCC CAGTCTAGGC TGGATGATGA 
CCXACCTCCT TCTACGCTGC TCAAAGACTA CCAGAATGTC CCTGGAATTG AGAAGGTTGA 
TGATGTCGTG AAAAGACTCT TGTCTTTGGA AATGGCCAAC AAGAAGGAGA TGCTAAAAAT 
CAAGCAAGAA CAGTrTA3CA AGAAGATTGT TGCAAACCCA GAGGACACCA GATCCCTGGA 
GGCTCGAATT ATTGCCTTGT CTGTCAAGAT CCGCAGTTAT GAAGAACACT TGGAGAAACA 
TCGAAAGGAC AAAGCCCACA AACGCTATCT GCTAATGAGC ATTGACCAGA GGAAAAAGAT 
GCTCAAAAAC CTCCGTAACA CCAACTATGA TGTCnTGAG AAGATATGCT GGGGGCTGGG 
AATTGAGTAC ACCTTCCCCC CTCTGTATTA CCGAAGAGCC CACCGCOGAT TCXSTGACCAA 
GAAGGCTCTG TGCATTCGGG TTTTCCAGGA GACTCAAAAG CTGAAGAAGC GAAGAAGAGC 
CTTAAAGGCT GCAQCAGCAG CCCAAAAACA AGCAAAGCGG AGGAACCCAG ACAGCCCTGC 
CAAAGCCATA CCAAAGACAC TCAAAGACAG CCAATAAATT CTGTTCAATC A3TTAAAAAA 
AAAAAAAAAA AAAAAAAAAA AAAAAOGGGA OGGG 



(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
GGCASAGCCA CCTCGGCCCC GGGCTCOGAA GCQGCTCGGG GGCGCCCTTT CGGTCAACAT 
CGTAGTCCAC CCCCTCCCCA TOCCCAGCCC CCGGGGATTC AGGCTCGCCA GCGCCCAGCC 
AGGGAGCCGG CCGGGAAGCG CGATGGGGGC CCCAGCCGCC TCGCTCCTGC TCCTGCTCCT 



GCTGTTCGCC TGCTQCTGGG CQCCCGGCGG GGCCAACCTC TCCCAGGAOG ACAGCCAGCC

CTGC5ACATCT GATGAAACAG TGGTGGCTGG TGGCACCX3TG GIX3CTCAAGT CSCCAAGTCAA 300 

AGATCACGAG GACTCATCCC TGCAATGGTC TTAACCCTGC TCAGCAGACT CTCTACTTTC 360 

GGGAGAAGAG AGCCCTTCGA GATAATOGAA TTCAGCTGGT TAMCTCTACG CCCCACGAGC 420 

TCAGCATCAG CATCAGCAAT GTGGCCCTGG CAGACGAGGG OGAGTACACC TGCTCAATCT 480 

TCACTATCCC TGTGCGAACT GCCAAGTCCC TCC3TCACTGT GCTAGGAATT CCACAGAAGC 540 

CCATCATCAC TGGTTA'EAAA TCTTCATTAC GGGAAAAAGA CACAGCXZACC CTAAACTGTC 600 

AGTCTTCTCG GAGCAAGCCT GCAGCCCX3GC TCACCTGGAG AAAGGGTGAC CAAGAACTCC 660 

ACGGAGAACC AACCXX3CATA CAGGAAGATC CXIAATGGTAA AACCTTCACT GTCAGCAGCT 720 

CGGTGACATT CXAGGTTACC CGGGAGGATG ATGGGGCGAG CATCGTGTGC TCTGTGAACC 780 

ATGAATCTCT AAAGGGAGCT GACAGATCCA CCTCTCAACG CATTGAAGTT TTATACACAC 840 

CAACTGOGAT GATTAGGCCA GACCCTCCCC ATCCTCGTCA GGGCCAGAAG CTGTTGCTAC 900 

ACTGTGAGGG TCGCGGCAAT CCAGTCCCCC AGCAGTACCT ATGGGAGAAG GAGGGCAGTG 960 

TGCCACCCCT GAAGATGACC CAGGAGAGTG CCCTGATCTT CCCTTTCCTC AACAAGAGTG 1020 

ACAGTGGCAC CTAOSGCTGC ACAGCCACCA GCAACATGGG CAGCTACAAG GCCTACTACA 1080 

CCrrCAATGT TAATGACCCX: ACTCCGGTGC CCTCCTCCTC CAGCACCTAC CACGCCATCA 1140 

TCGGTOQGAT CGTGGCTTTC ATTGTCTTCC TGCTGCTCAT CA3X3CTCATC TTCCTTGGCC 1200 

ACTACTT6AT CCX3GCACAAA GGAACCTACC TGACACATGA GGCAAAAGGC TCCGACGATG 1260 

CTCCAGACGC GGACACGGCC ATCATCAATG CAGAAGGCGG GCAGTCAGGA GGGGACGACA 1320 

AGAAGGAATA TTTCATCTAG AGGCGCCTGC CXZACTTCCTG CX3CCCCCCAG GGCCCTGTGG 1380 

GGACTTGCTG GGGCCGTCAC CAACCCGGAC TTGTACAGAG CAACCGCAGG GGCCXJSCCCT 1440 

CCrajrrGTT CCCCAGCCCA CXTACCCCCT TGTTACAGAA ' IVl ' VTKG ' ri ' i ' GGGGTGOGGT 1500 

TTTOIWATTG GTTTNQGATN GGGGAAGGGA GGGANGGCGG GG 1542 



(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1390 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:
CAAGCTCTAA TACGACTCAC TATAGGGAAA GCTQGTACGC CTGCAGGTAC CGGTCCGGAA
TTCCCGGGTC GACCCAOSCG TCOGGGCCTC AQGGTGGACG CATGGTTCTG CACTGAGGCC 
CTCGTCATGG TGGCGCCTGT GTGGTACTTG GTAGCGGCGG CTCTGCTAGT CGGCTTTATC 
CTCTTCCTGA CTCGCAGCCG GGGCCGGGCG GCATCAGCCG GCCAAGAGCC ACTGCACAAT 
GAGGAGCTGG CAGGAGCAGG CXGGGTGGCC CAGCCTOGGC CCCTGGAGCC TCAdGAGCCG 
AGAGCTGGAG GCAGGCXTTCG GCGCCGGAGG GACCTGGGCA GCCGCCTACA GGCCCAGCGT 
OGAGCCCAGC GGGTQGCCTG GGCAGAAGCA GATX3AGAACG AGGAGGAAGC TGTCATCCTA 
GCCCAGGAGG AGGAAGGrTGT CGAGAAGCCA GCGGAAAYTC ACCTGrrCGGG GAAAATTGGA 
GCTAAGAAAC TGCGGAANNT GGAGGAGAAA CAAGCGCXSAA AGGCXX:AGCK TGAQGCAGAG 
GAGGCTGAAC GTGARGWGCG GAAACGACTC GAGTCCCAGC GCGAATGAGT GGAAGAAGGA 
GGAGGAGCGG CTTCGCCTGG AGGAGGAGCA GAAGGAGGAG GAGGAGAGGA AGGCCCGCGA 
GGAGCAGGCC CAGCGGGAGC ATGAGGAGTA CCTGAAACTG AAGGAGGCCT TTGTCSGTGGA 
GGAGGAAGGC GTAGGAGAGA CCATGACTGA GGAACAGTCC CAGAGCTTCC TGACAGAGTT 
CATCAACTAC ATCAAGCAGT CCAAGGTDGT GCTCTTGGAA GACCTOGCTT CCXIAGGTGGG 
CCTACGCACT CAGGACACCA TAAATCGCAT CCAGGACCTG CTGGCTGAGG GGACTATAAC 
AGGTGTGATT GACGACCGGG GCAAGTTCAT CTACATAACC CCAGAGGAAC TGGCCGCCGT 
GGCCAACTTC ATCCGACAGC QGQGCCGGGT GTCCATCGCC GAGCTTGCCC AAGCCAGCAA 
CTCCCTCATC GCCTGGGGCC GGGAGTCCCC TGCCCAAGCC CCAGCCTGAC CCCAGTCCTT 
CCXrrCTTGGA CTCAGAGTTG GTCTGGCCTA CCTGGCTATA CATCTTCATC CCTCCCCACC 
ATCCTGGGGA AGTGATGGTG TGGQCAGGCA GTTATAGATT AAAGGCCTGT GAGTACTGCT 
GAGCTTGGTG TQGCTTGGTG TQGCAGAAGG CCTGGCCTAG GATCCTAGAT AAGCAGGTGA 
AATTTAGGCT TCAGAATATA TCCGAGAGGT GGGGAGGGTC CCTTGGAAGC TGGTGAAGTC 
CTGTTCTTAT TATGAATCCA TTCAITCAAG AAAATAGCCT GTTGCAAAAA AAAAAAAAAA 
AAAAACTCGA 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1288 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 
GGCGCGCQGG TGAAAGGCGC ATTGATGCAG CCTGCGGCGG CCTCGGAGCG CGGCGGASCA
GACGCTCACC ACGTTCCTCT CCTCGGTCTC CTCCGCXTCC AGCTCCGCGC TGCCCGGCAG 120



CCGGGAGCCA TGCGACCCCA QGGCCXXX3CC GCCTCCCCGC AGCQGCTCCG CGC3CCTCCTG 180

CrocTCCTCC TGCrcCAGCT GCCCGCGCCG TCGAGCGCCT CTGAGATCCC CAAGGGGAAG 240

CAAAAGGCGC ATCCGGCAGA GGGftGGTCSGT GGACCTGTAT AATGGAATGT GCTTACAAGG 300

GCCAGCAGGA GTCCCTCGTC GftGAOSGGAG CCCTGGGGCC AATQGCATTC CGGGTACACC 360

TOQGftTCCCA GC?rCGGGATG GATTCAAAGG AGAAAAGGGG GAATGTCTGA GGGAAAGCTT 420

TCAGGAGTCC TGGACACCCA ACTACAAGCA GTGTTCATGG AGITCATTGA ATTATGGCAT 480

AGATCTTGQG AAAATTGCGG AGTC3TACATT TACAAAGATG CGTTCAAATA GrTCCTCTAAG 540

AOTnTCTTC AICTGGCTCAC TTCGGCTAAA ATGCAGAAAT GCATGCTGrTC AGCGTTGGTA 600

TTTCACATTC AATGGAGCTG AATGTTCAGG ACCTCTTCCC ATTGAAGCTA TAATTTATTT 660

GGACCAAGGA AGCCXTTGAAA TGAATTCAAC AATTAATATT CATCGCACTT CTTCTGK3GA 720

AGGACTTTCT GAAGGAATTG GTGCTGGATT AGTGGATGTT GCTATCTGGG TTGGCACTTG 780

TTCAGATTAC CCAAAAGGAG ATGCTrCTAC TGGATQGAAT TCAGrTTTCTC GCATCATTAT 840

TCAAGAACTA CCAAAATAAA TGCTTTAATT TTCflTTTGCT ACCTCmTT TTATTATGCC 900

TIGGAATCCT TCACTTAAAT GACaTTTTAA ATAAGTTTAT GTATACATCT GAATGAAAfiG 960

CAAAGCTAAA TATGrTTTACA GACCAAftGTG TGATTTCACA TGTTTTTAAA TCTi\GCATTA 1020

TTCATTITCC TTCAATCAAA AGTGGTTTCA ATATTTTTTT TAGTTGGrrTA GAATACTTTC 1080

TTCATAGTCA CATTCICTCA ACCTATAATT TGGGAATAIT GTTGrrGGTCT TTTGTTTTrT 1140

CTCTTAGTAT AGCATTTTTA AAAAAATATA AAAGCTACxilA ATCTTTGTAC AATTTGTAAA 1200

TOTTAAGAAT TTTTTTTATA TCTGrTTAAAT AAAAATTATT TCCMACAACC TTAAAAAAAA 1260

AAAAAAAAAA AAAAAAAAAA AAAAANAA 1288

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1517 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

AOrGGCTTAA AGGCATCGTT TEAGGGATTA CTGQGAAGTA TCTTCAAAGT AATACATGAG 60 
AAACATTCCT TCCTAAATCC TTTATTATAT TGAATATCGT ATTAATTGGT TTTCAGAGGT 120 

60 
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50 



(2) INFORMATION FOR SEQ ID NO: 127 



(i) SEQUENCE CHARACTERISTICS: 
55 (A) LENGTH: 1073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDECNESS: double 

(D) TOPOLOGY: linear 

60 (xi) SEQUENCE DESCRIPTICWI : SEQ ID NO: 127 



379 



TAAATTAACC ATGTATTCCT GCAATAAATG TCACITGTNT CTTGTATATA ATCTTTTTTA 180 

TATATTACCG GAITCATTCA TTAC3TATTTT GriTGAGGATT TrTGTGTCTA TATTCATAAG 240 

AGATCCTGGT CTCCAGITTT CTTTTTTTGT GATAATCTGG TTTTTCTATC AGTAATACAG 300 

GCCCCATCAA ACGAGTTCGG AAGTGTTCAC CrCTCTTGTA TTTTTTCAAG AGTTTGTGAA 360 
GAAITCCTAT TAATTCTTTA AATGTTTGC?r AGAATCTACC ATTGAAATCA TGTGTCCTGG - 420 

GCTTTTTTTT GAGGGAAGTG TTCTGATAAC TAATTCAGTA TCTACTTriT ATAGCTCTCT 480 

TCAGATTTTG CTTCTTCCTG AGTTAGTTTT GGTAATTTGT GTATCTCTAG GARTTTGTCC 540 

15 ATTTCATTTA TCTCATTTGT TOCXATAAAT TAAACTAAAT TTGGCCTGAG CCTACCTGTA 600 

TATCTTGAGT CCCTCTCTAA GGAACTGTAG CCTAACTTGT ACATAAACAA ACTGAAATCC 660 

TAAATTAGGA ATCTAGTnT TGTAACAGCT CCTGAGTCTC AGGCAGTCAC AGCAGYCAAG 720 

20 

TCK3TCAATT GCAGGCTGCT AACTAAGCAG CCCATGSTCA AATGAGGCAA AAACCTTTGC 780 

TTTTAACACA TAGTATAGCT TTGTAATCCT TTTCTTGCAC ACTCGGGTAA TTTCTTCCTT 840 

25 TITCATrCCC KGWATTTTCC AKGAATATGA RTCTYCCTTT TTTCCCCTCC TGTCAGTCTA 900 

GCTAATGGTT TOTCAATTTT GTTGATCTTT TGAARAACAA ACCTTTOGrTT CCACTTTCTT 960 

GTTGCATATG CTGAFTATTC TCATAATTGG AGTGGAAAGC TGATCTTTGA TTACTTATTT 1020 

30 

TACTTAGGGC TCAQGAGTTC ATGGACTTCG CAAAACCTCC TTGAATCTAA ATTGCATCTT 1080 

CITTCCTCGT TTCTGGGCTG AAACATGTTT TTTCCCATCT WANAWACCCT TGGTCTTTTC 1140 

35 ATKQGCGATT AAGACTAGAG AAAGTTCTAG ATMCCTTGTC CTTTTATGCT GTCATTTTGT 1200 

TTAAAGGCTT TCTATGTAGT AAAACTATCT ATATAGACAA AATAGAGCCT TGAGTTGTOG 1260 

TCTTCAATTT GATCAACATG ATTTACCACA TTCTGTACTG GATATTTCTT CACCTGCTCC 1320 


TACTGTAAAC CATTTTATTC TTGGATCTTC TGTAGAGTAT ATTATCACAG C3TACTTTTTA 1380 

CAGGGCnXSTC TAATCTTTTG GCTTCCCTGG GCACATTGAA AGAAGAAGAA TTGTCTTGGG 1440 

45 CCACACATCA AATACGCTAA CACTAATAAT AGTTGATGAG CTAAAAAAAA AAAAAAAAAG 1500 

GCAAAAAAGN CCCAAAA 1517 



TGAATCTATT CTTTGAACAT TCTACAACAA GAATTACATT ATACTGTTAT ACCAGAGTAC 60 

TTCTCCACTTG TGAAATAGAT TGGTTTGGAA AATGAACCTG GCTITGCTAT AAATTACATT 120 


C^JCAGGCCTT TITGCAAATG TGTAACTTGC CTATCAAAlGT AGmTGTAGG GCAAATGCAG 180 

AATATATGTC TCCATCTGGT AAAGTACCTT WTAYTCATGT GGGAAATCAA GTAGTATCAG 240 

10 AACTTCGTCC AATAGTCCAA TTTGTTAAAG CCAAGGGCCA TTCTCTTAGT GATGGGCTGG 300 

AGGAAGTCCA AAAAGCAGAA ATGAAAGCTT ACATGGAATT AGTCAACAAT ATGCTGTTGA 360 

CTCCAGAGCT GTATCTTCAG TGGTGTGATG AAGCTACAGT AGGO(MGATC ACTCATCM*A 420 


GGTATGGWrC TCCTTACCCT TGGCCTCTGW WTCATATTTT GGCCTATCAA AAACAGTGGG 480 

AAGTCAAACG TAAG^3TGAAA GCTATTGGAT GGGGAAAGAA GACTCTQGAC CAGGTCTTAG 540 

20 AGSATOTAGA CCAGTGCTGT CAAGCTCTCT CTCAAAGACT GGGAACACAA CCGTATTTCT 600 

TCAATAAGCA GCCTACTGAA CTTGACGCAC TQGTATTTGG CCATCTATAC ACCATTCTTA 660 

CXIACACAATT GACAAATGM GAACTTTCTG AGAAGGTGAA AAACTATAGC AACCTCCTTG 720 


CTTTCTGTAG GAGAATTGAA CAGCAdATT TTGAAGATCG TQGTAAAGGC AGGCTGTCAT 780 

AGAGTTATCT GTTAGTCTCA GGAGTCTTAA CTTTTGAAAT ATGTTTTACT TGAATGTTAC 840 

30 ATTAGATATT GGTGTCAGAA TTTTAAAACC AAATTACTGC TTTTTGAAAC CTCAAATTAT 900 

ATAATGTATC TTAICTATGT GCTTTATATT GTTATTTGTG TATACATTAA AATAATTCTG 960 

AATTATTTAA TCTGATATGT TGTATTCTGT ATCTTGAAAT TTTTGTTTCC TTGAAACATG 1020 


CATGCATTTA AAAATAAAGC TTAAACAACT GTAAAAAAAA AAAAAAAAAA CTC 1073 



(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 


CAACCCCTGC ClTriTVni; TTTTCCATTT GCTTGGTAGA TCTTCCTCCA TCCCTTTATT 60 
TTGAGCCTAT GTGTGTCTCT GCCCGT6AGA TGAGTCTCCT GAATACAGCA CACTTACTGG 120 
55 TCTTGACTCT GTATCCAATT TGCCAGTCTG TGTCTTTCAT TTGGAGCATT TAGCCCATTT 180 
ACATTTAAGG TBCAATATTGT TATGTGTGAA TTTRATCYTR TCATTATGWT GTTAGCTGGT 240 
TATTTTGCTT GTTAGTTGAT GCAGTTTCTT CCNGGCATCA ATGGTCTTTA CAANTTGGCA 300 

(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1275 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

GGCAGAGCCT GTCCCTGCTG CCCCTGCAAA AAAAACCCCC TCTGC3TGTGA GCAGGATGGT 60 


TOGAGGTTAT GTGAGCTCCT TCTCCTTTCC TCCAGTTTCC TCTTCCCTTC TCCTCCCTGC 120 

CTCTTTTGCT TTTCCCTTTC TTCCTGGTAC CCCCTQCCCA TTCCTGTATT TTCTCCCATC 180 

20 GCCATTCTCC CdCTCCCAC TGTCCCTAAC CCGTTCAAAC TCTrTCCTCT TAAATGGTTG 240 

AGATTTTCTC TCACCAAGCA CACCCCAGTA TTAATTAAAC TAGCTGCAAA CAGGCAGCAA 300 

GTGGTCTACC ATGACAGATG GGTTTTGTGT GTOPGTGTGT CTTGTGTAATr GTAATAAAAC 360 


ATATTGARTC ACTCAATAAA CACAGAGTGT CTACTACATG TATCARGCAC TATCATAGAT 420 

GCTAATTAAC GAAACTGAAA TGGCCAGGCC CTCACAGrGG CTCATGCCTA TAATCCCAGC 480 

30 ACTTTOGGAG GATGAGGCAG GAGGATCACT TCAQGCCGGG AGTTCAAGAC CAGCCTGGGC 540 

AACATAGTAA GACTCCATCT CTACAAAAAA AAAATTTTTT TTATTATACT ITAAGTTTTG 600 

GGTTACATGT GCAGAAOGTG TAGTTTTGTT ACATAGGTAT ATACGTGCCC TGGTAGTTTG 660 


CTCCACCCAT CAACCCATCA CCTACATTAG GTATTTCTCC TAATGTTACC CCTCTCCTAG 720 

CCCCCCACCC CCTGACAGGC CCTQGTGTGT GATGTTCCCC TCCCTGTGTC CATGTGTTCT 780 

40 CATTGGTCAA CTCTCACCTA TGGAGTGAGA ACATGTGGTA TTTGGTTTTC TGATCTTGTG 840 

ATAGCTTGCT GAGAATGTKG GTTTCCAGCT TTATCCACGT CCCTGCAAAG GGCATAAACT 900 

CATCCCTTTT TATGGCTGCA TAGTGTTCCA TQGTGTATAC GTCCCACATT TTCTTAATCT 960 


ATCATTGATG GACAAGTTTT GCTATTGTGA ATAGTGCCAC AATAAACATA CJGTGTGOGrrG 1020 

TOPCTTTATA GCAGCATGAT TTATAATCCT TTGGGTATAT ACCCAGTAAT GGGATCACTG 1080 

50 AGPCAAATGG TA3TICTCGT TCTAGATCCG TAAGGAATTG CCACACTGTC TTCCACAATG 1140 

TTTCAACTAA TOTACACTCC CACCAACAGT GTAAAAGTGT TTCTATTTTT CCACAACCTC 1200 

TCCAACATCT GTTAarTTCCT GACTTTTTAA TGAACGTCAT TCTAACTGGC GTGAGATGGT 1260 
ATCTCATTGT GGTTT 1275 



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 472 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

OKSAAACCCC CTTCAACCCTC CCCGGGTTAA AAAGCCCCCC CTAAATQGGG GGAACGCYTC 60 

ACACCTTTATA AAAAAGCACT AGAATGTTTT GAAAGCGAGA AACAACAGCT GrTCTAGQGTA 120 

15 GCTAGCAGTT AGTGTTCTAC AGAAGACAGA TATTTGTGCA TTTYTGCATT TTCTAAGnT 180 

GCTGCAATGA GCATGTATTA CTTTCATAGT TATAAAACAC ATGCAAAATG CCCTTTTAAA 240 

ATCAAAAAAA ATXXATGAGT GTAAGTGATA TATATGCTTT GGAAAGCCTG GGACGGTCAT 300 

TGriTACTCT CAATAGTATG TGTTTGCCTT TGTCTTTTTG AGACATTTTG TTTTAATCTG 360 

TTCATCACAA TAACCTGnTC ATAATATAAC TTGATAACAA ATAAAATGAC TTATGATTGA 420 

25 AWMAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA NN 472 



(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1950 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

40 ACCTCrCAGA ATCTTCTCTC AGCAAOCTGA GTCITCGCCG TTCCTCAGAG CGCCTCflGTG 60 

ACACCCCTGG ATCCTTCCAG TCACCTTCCC TGGAAATTCT GCTGTCCAGC TGCTCCCTGT 120 

GCCGTX3CCTC TNATTCGCTG GTGTATGATG AGGAAATCAT QGCTQGCTQG GCACCTGATG 180 

ACTCTAACCT CAACACAACC TGCCCCTTCT GCGCCTGCCC CmOTGCCC CTGCTCAGTG 240 

TCCAGACCNT TGATICCCGG COCAGTGTCC CCAGCCCCAA ATCTGCTGGT GCCAGTGGCA 300 

50 GCAAAGKTOC TCCTGTCCCT GGrTGGTCCTG GCCCTGTGCT CAGTIGACCGA AGCTCTGCCT 360 

TGCTCIX3GAT GAGCCCCAGC TCTGCAACGG GCACATGGQG GGAGCCTCCC GGCGGGTTGA 420 

GAGTGGGGCA TCGGCATACC TGAGCCCCCT GGTGCTGCGT AAGGAGCTGG ACrrCGCTGGT 480 

AGAGAACGAG GGCAGTCAGG TGCTGGCGTT GCCTGAACTG CCCTCTGCCC ACCCCATCAT 540 

CirCTCGAAC CTTTTCIGGT ArTTCCAACG GCTACGNCTG CCCAGTATTC TACCAGGCtT 600 

60 GGTGCTGGCC TCCTCTGATG GGCCTTOGMA CTCCCAGGCC CCATCTCCTT GGCTAACCCC 660 






TCATCTAGCC TCICTTCAGG TACGGCTGCT GTGGGATGTA CTGACCCCTG ACCCCAATAG 720 

CTCCCCACCT CTCTATCTGC TCTOGAGGGT CCACAGCXAG ATCCCTCAGC GGGTGGTATG 780 

GCCAGGCCCT GTACCTGCAT CCCTTAGrTTT GGCACTGTTG GAGTCAGTGC TGCGCCATCT 840 

TGGACTCAAT GAAGTTGCACA AGGCTGTGGG GCTCCTGCTG GAAACTCTAG GGCCCCCACC 900 

GACTGGCCTG CACCTCCAGA GGGGAATCTA CCGTGAGATA TTATTCCTGA CAATGGCTGC 960 

TCTGQGCAAG GACCACC3TGG ACATAGTCGC CTTCGATAAG AAGTACAAGT CTGCCTTTAA 1020 

CAAGCTGGCC AGCAGCATGG GCAAGGAGGA GCTGAGGCAC CGGCC3GGCGC AGATGCCCAC 1080 

TOXAAGGCr ATTGACTGCC GAAAATGTTT TGGAGCACCT CXIAGAATGCT AGAGACCTEA 1140 

AGCTTCCCTC TCCAGCCTAG GGrTGQGGAAG TGAGGAAGAA GGGATTCTAG AGTTAAACTG 1200 

CTTCCCTGTT GCCTTCATGG AGTTGGGAAC AGQCTGQGAA GGATGCCCAG TCAAAQGCTC 1260 

CAAGCGAGGA CAACAGGAAG AGGGATCCAC TGTTACCAAA AGTCCTGATT CCCCCATCAC 1320 

CAACCTACCC AGrrTTGTTCG TGCTGATGTT GGGGGAGATC TGGGGQGAGT TGGTACAGCT 1380 

CTOTTCTTCC CTTGrrCCTAT ACXX3GGAACT CCCXTCCAGG GTACCCACAG ATCTGCATTG 1440 

CCCTGGTCAT TTTAGAAGTT TTTGTTTTAA AAAACAACTG GAAAGATGCA GAGCTACTGA 1500 

GCCTTTGCCC TGAATGGGAG GTAGGGATGT CATTCTCCAC CAATAATGGT CCCTCTTCCC 1560 

TGACGTTGCT GAAGGAGCCC AAGGCTCTCC ATGCCTTrCT ACCTAAGTOT TICTATTTTA 1620 

TTTTAAATTA TETATTCTGG AGCCACAGCC CCCTTGCTTA TGAGGTTCTT ATGGAGAGTG 1680 

AGAAAGQGAA GGGAAATAGG GCACCATGGT CCGGTGGTTT GTAGTTCCTT CAAAGTCAGG 1740 

CACTOGGAGC TAGAGGAGTC TCAAGCTCCC CTTAGGAAGA ACTGGTGCCC CCTCCAGTXX: 1800 

TAATTTTTCr TGCCTGCCCC GCCTTGGGGA ATGCCTCACC CACCCAGGTC CTGACCTGrTG 1860 

CAATAAGGAT TGTTCCCTGC GAAGnTTTGrT TCGATOTAAA TATAGTAAAA GCTGCTTCTG 1920 

TCTTTTTCAA AANAAAAAAA AAAAAAAACT 1950 






(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 990 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 
TOGAAGATTT AAAATAGGTT TCATATTTCT CTTGAATATG AATATATAAG CTTGAATAAG 60 

rOAGTCCr TATTATTP-.TG AA-.TTTICCr TATTATTTCT ACCAATGCTT CTTATATTAA 120 

AGCCTGArcr TTnCATATT Aii^TATATCTA CATTAGCTGC CTGTGGATTA ACATTTCCAT 180 

a?iA-.ix7rATr TTr3C--rrGT ttgatcttaa AcrrTrrGTG tctttatata aggtatgcty 240 

CrrrTAA^C-. TGATArTTTT AA-CACAATA GTTGAAAGAC AATCTYCACC TTTTACTTGrT 300 

A'T^.nTACAT GrTAAirriAAT TTTTGATGCA TATTACGTCT TATTATTTAA CCAACCTATT 360 

TTATITTArC TAGGGCATTT TTCiySAAAGC CTTATTTTCT TGTATTAATC AAATATTTTT 420 

AYC-.ITCrAr TTTCC^rTAT TAOTAC2CAA TACXaeTACYC YAAATATATA TTGrGGSTAT 480 

15 TTTC.:^GAAT? GCAATAIGCC TCCTTAATTT ATTAGAGGCT AACCTAAATT ATTACTTTTA 540 

CCACTTACTT GAAAATCCTC GAACTTTAGA ACATTTATTG TTTTATGCAT TTTAATTCTA 600 

CTrrrrA'^T*!"! -PACTJCr^XTT AiiCATTATT ArTGTTTTAG ACAAGCCAAA ATATATNTTG 660 


TTATTAirCTT AT/CTCCATT TCrTTCTGTA TTTTTATGCC ACTATGTATG CTCAATTTCC 720 

TICrATG?GA TG«ACCTAAT TCAGTACTTT TGTTTTTTAA TCTGrTGCAGG TAGCCTC3GCC 780 

25 ATTAAATITT TA IlTriVG OT TTGCTGAAAA AATTGTGrTTT ATTTCTATAT GCATACTTAT 840 

GCArATASAA T>rTAG7r^?G ACATATmT AGTATTTATA AATGTAAAGT CaTT5(4ATTKG 900 

GCTTCTArCA TTTCKGI^OGA a^AATCAATT GTrCAGCCCAA TAGTTTTTCA TTTTAAATTA 960 

CNa-J Vi ' - ' i- ' ri rcATGicrcT gc/jittagga 990 






(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1720 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

CnCTGACrAAG CGACICTGGT TATTCCCCTA AAGTrTACTT CAGCACTAAC ACTAGTCCTT 60 

CCGCTGGAGT TTGCfi£?7ITr CCAGCTTTAT ACAGGATTTT CCTTTGACTG GAAGAGTCAA 120 

50 GGATATAGAG ACTCAACAGT a^^AOTTATT GTACAACATC AAGGGGAATA GGATACTCAT 180 

CAAAdGGGA TTATTCITAT CAAAACATGG TCTTCTTTGA ATAAGAAAAA TACATAGTTG 240 

GTT^TTATCG ACTTAAAACT GTCTTAAATG GATATTCPGA TAAAATATTT GCTGCTCTGT 300 


AGACTKnCGA AAATCTGAGA A7ATTAGCTT TACTCATCTT GAGCTTTGAG GATCTTCTCT 360 

OTACGCCGAT QGriTTCATAT TAACTAAAAA AGCTGGGTAT TGTAAAATCT CATrTATAAA 420 

AAC7CAGAT3 AGAAGAAAAT TITCTTTGAT GGTGAGACTG TTGTCTTAGT TCAGGAAATT 480 

ATTTAATAAT CCTTTGTTAC CTGTGAATGA AGGAACTTTG TAATPCTGAT TTATCCTAAA 540 

ACATGAGCCT TTCCAGAGTC AGCTTAGACA CTGTTGTCGC AAATAGCCAT GCTTTGCCTT 600 


ATGCCAAGGA GGCCCAGAGG GAGGGCCTAG TCTTCCTCTG TTGCTGTACA TATATTGAAA 660 

TGcrnrnr TrrrATTTTG catttgttat ctataatgag ctttctgagc cctgatatta 720 

10 tgtcagacaa ACAGGAGTTA TTGATGTTAT ACACTCCCTT OCATTCAGGA TTTTCTGCTT 780 

GGAGGGAAAT ATGnTGACCT TAGAGAATTG TGAATATTGT TGCAATTCTT GAATATATTA 840 

CCATCTGAAT AATAGAGACT GTGTIX3CTCT CTAGTATAAG CTATATTTAT TTTTGATTCA 900 


TnGAATTAC TAGTTATAAC TQGAGAAATT TTOTTACCTC TATCCTGGCT TGCCTGACTG 960 

GCICTATAAT AGCAGCAGCC TCnTTAGAG CATCTTAATG AAAACATGGA TGAAAGGAAT 1020 

20 TAATGATGAT ATCTGCAGAC TGCCTAGAAA ATGGCTTTPG TTCCCAGCGT TAACATTTTC 1080 

TTCTCAATTCA CATTTCAATG TTTGTGGAGA GTGGCAGATT CACACCAGAA ACACTAGGTG 1140 

TICATATCCA TAGCATGGAT GCAGAATAAG CAGTTGGGAG AGAAGCTTCT TCCTACCTGG 1200 


TACTCCTCCC ATTCACCTCA GCCCAGCXXX: AGACAGGCGT TAGCATTCAG TGTQGGCCCT 1260 

CAGGCAGCCC TGAftGCCTGG CTGGGTCATC AGATGGGGGC AOXTGTGAC GGGCACXIAGC 1320 

30 GGCXTTGATTC CAGGGAAGAG TTCCTGGftGG GrrGTTGGCTG TTTTTGTTAG CTCAGTTTTT 1380 

TTCTGGGCPC CACCATTCCT AACTCCAGGT AGACAAGATA GATGTCACAC ACAACAATTT 1440 

TAAAGTATTT TGCTTAGTGC ATTTTGnTA TGATPGCAGT GTTTGTTTCT TATTTAATAG 1500 


GCTTTTTACT TGATTCTATT AAATTTTAGT GrTTTAGAAGA GGCGGGTACT GTCACTGPGT 1560 

AAAATATOTA ATATTTTATA TGTTATACCA TGTCATATAT ACTTGCAATA TCAGACCTTG 1620 

40 CATICAATAT ACAAIGCAAT TGACTCTTTC CAGACCTGCA TTTTTCAGTG AACAATAAAA 1680 

AGATTGTCTG GCACTCCAAA AAAAAAAAAA AAAAAAAAAA 1720 



(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 705 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

GGCACGAGGC CATCTCGGCT CATTCAGCAG GAAATAATGG AAAAAGCTGC AATATCCAGG 60 
TGTTTACTAC AATCTGGAGG CAAGATCTTT CCTCAGTATG TGCTGATGTT TGGGTTGCTT 120 

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 323 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



GIGGAATCAC AGACACTCCT AGAGGAGAAT GCTGTTCAAG GAACAGAACG TACTCTTQGA 180 

TTAAATATAG CACCTTTTAT TAACCAGrTT CAGGTACCTA TACGrrGTATT TrTGGACCTA 240 

TCCTCATTGC CCTGTATACC TTTAAGCAAG CCAGTGGAAC TCTTAAGACT AGATTTAATG 300 

ACTCCGTATT TGAACACCTC TAACAGAGAA GTAAAGGTAT ACGTTTGTNA AATCTGGGAA 360 
GAdTGACTG CTATTCCATT TTGQGTATCA TATGTACCTT GATGAAGANG ATTAGGTPGG 420 

GATACITCAA GTGAAGCCTC CCACTGGAAA CAAGCTGCAG TTCTrTTAGA TAATCCCATC 480 

CAGGTTGAAA TGGGAGAGGA ACTTGTACTC AGCATTCAGC ATCACAAAAG CAATGTCAGC 540 

15 ATCACAGTAA AGCAATCAAG AGCAGTTTTC CAATGAAAAC TGTGTAAATA GAGCATCAAC 600 

AACTTACAAAA TTCTTGTCTT AATTAGTGGG GGTATATAAA AATTCCTTGT AATGGTCAAA 660 

TATTTTTTAA AATTGACATT AATAAAGCAT ATTTTAAAAG TTTCT 705 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

AGCACACACC TCCTTTAGTT GCTCCTAAGG TCATGTTCAA CATTCGTGGA GTGCATTTTC 60 

TQCTCAGGGA GCTTTCCCAG ACCCGGAATG TTrGGTGCTC ACAGACYCTG GCAAGGATCG 120 

GTATIGCTGT TCCTCAGrTTT TGCCTGGGGA AATGGAGGST CAGTGACGTT CAGTGAC3GTG 180 

40 CCCAGAGTCA 1GCCATTGGC GGGTGGCCCA GEOGMTCCAGG TCTCCAGCAC CCCTCGGCCC 240 
CCrCCTCACC AGGTCACATC ATCTCCTGGA TTAGAATCTG CTCACATAGT CTGTCCTGAA 300 

AGGAAAAAAA AAAAAAAAAA AAC 



(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 582 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

GGACGGAATC GTGCAACCCT CdMAMTTTT CTKGKGCrGT TGACAACAGA GGGAGGGAGG 60 

GAAAACATTT TTYGTGGGAG AATCCTACYT CTGCAGSGGA GCCCTTAAGC GATKGATTTT 120 

GAATCTKGAC CXTTTTACCAA CTAATTTTGA AGGAAGATAC CTTGGAAATA rrTGCCATTC 180 

ACTTGGCnTAC TCAAACAGCA TTAGTrGAATT CATCTAGAGA ACTCTTTCAT TTATTCAGGC 240 

AACAACTCTA CAACTTGGAA ACCTTGTTAC AGTCCAGriTG TGATTTTGGG AARGTATCAA 300 
ctctacack; CAAAGCAGAC AATATTAGGC AGCACTGTGT ACTATTTCTC CATTATGTTA 360 

AAGrmCAT CTTCAOGTAT CTGAAAC3TAC AGAATGCTGA GAGTCATGTT CCTGTCXATC 420 

CTTATCAGGC TTTGGAGGCT CAGCTTCCCT CAGTGTTGAT TGATGAGCTT CATGGATTAC 480 

15 TCTTCTATAT TGGACACCTA TCTGAACTTC CCAGTCTTTAA TATAGC5AGCA TTICTAAATC 540 

AAAACCAGAT TAAGGrTTTGA CTGGTTTCAT TTGATTTTTA AG 582 






(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

TTCGGCAGAG CCCTTGCGCG CTCTTGAATA CCTGCKTTCT GTAGCGCTAG TTCTCTrCAA 60 

GATTTOCTTA arGICATTTC ATTTCGGmT CTTTTCTCGC CATGTTTTTC TGTCGGAATT 120 

ACGGrTCCTT TTCGTTCTAT C?rACTCTCTA AAATGTTATC GTTTTTCATT TGTCTACTAA 180 

TTrTCGTGCA TTPGITACTA CTGAGTTTCT TAATATCTGA CTGGCCTCCG CCCACGGGCT 240 

40 CTGCAGANCA TAAAATACTC AGGCTGATGG TAGTCCAGAG ACTCTCCCTC CTTGATCAGC 300 

GCAAACGnTG GTCTGAGGCT TGAGGGATGG AGCAACATTT TCTTGGCTGT GTGAAGCGGG 360 

CTTOGGATTC CGCAGAGGTO GCGCCAGAGC CCCAGCCTCC ACCTATTGTG AGTTCAGAAG 420 


ATCOTOGGCC GTGGCCTCTT CCTTTGTATC CAGTACTAGG AGAGTACTCA CTGGACAGCT 480 

GTGATTTGGG ACTGCTTTCC AGCCCTTGCT GGCGGCTGCC pSGAGTCTAC TGGCAAAAOG 540 

50 GACrCTCrcC TGGAGTCCAG AGCACCTTGG AACCAAGTAC AGCGAAGCCC ACrGAGTTCA 600 

GITGGCCGQG GACACAGAAG CAGCAAGARG CACCCGTAGA AKARGTGGGG CAGGCAGARG 660 

AACCCGACAG ACTCAGGCTC CRGCAGCTTC CCTGGAGCAG TCCTCTCCAT CCYTQGGACA 720 

GACAGCAGGA CACCGAGGTC TGTCACAGCG GGTGCCTTTT QGAACGCCGC CATCCTCCTG 780 
CCCTCCAGCC GTGQCGCCAC CTCCCGGGTT TCTCAGACTG CCTGGAGTGG ATTCTTOGOG 840 



TraTTTTTGC CGCGTTCTCT GTACTCTGGG CGTGCTGTTC ACGGATCTGT QGAGCTAAGC 900 



AGCCTTAGAT AGCAGCAGAA GGCTTTTTGG ATTCTCCTCC TTGAAAAGAT TCTCAGTTAC 960 
CAAACCTTCTC CACCTAGAAA ATAAAAATAC ATTAAGATGT TGA^3AAAAAA AAANAAAAAA 1020 
A 1021 






(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1777 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

CGGAAGATGA TGGCTTCAAC AGATCCATTC ATGAAGTGAT ACTAAAAAAT ATTACTrGGT 60 

ATTCAGAACG AGTTTTAACT GAAATCTCCT TQGGGAGTCT CCTGATCCTG GTGGTAATAA 120 

25 GAACCATTCA ATACAACATG ACTAGGACAC GAGACAAGTA OCTTCACACA AATTGTTTGG 180 

CAGCTTTAGC AAATATGTCG GCACAGTTTC GTTCTCTCCA TCAGTATGCT GCCCAGAGGA 240 

TCATCACTTT ATTITCnTG CTGTCTAAAA AACACAACAA AGTTCTQGAA CAAGCCACAC 300 


AGTCCTTCAG AGGTTCGCTG ACTITCTAATG ATGrTTCCTCT ACCAGATTAT GCACAAGADC 360 

TAAATCTCAT TGAAGAAGTG ATTCGAATGA TGTTAGAGAT CATCAACTCC TGCCTGACAA 420 

35 ATTCCCTTCA CCACAACCCA AACTPGGTAT ACGCCCTGCT TTACAAACGC GATCTCTTTG 480 

AACAATTTCG AACTCATCCT TCAO^TCAGG ATATAATGCA* AAATATTGAT CTGGTGATCT 540 

CCTTCTTTAG CTCAAGGTTG CTGCAAGCTG GGAGCTGAGC TGTCAGTGGA ACGGGTCCTG 600 


GAAATCATTA AGCAAGGCGT CGTTGCGCTG CCCAAAGACA GACTGAAGAA ATTTCCAGAA 660 

TTCAAATTCA AATATGTGGA AGAGGAGCAG CCCGAGGAGT TTTTTATCCC CTATGTCTGG 720 

45 TCTcrrcrrcr acaactcagc AGrroGGCCTG tactqgaatc cacaggacat ccAGCTGrrrc 780 

ACCATCGATT CCGACTGAGG GCAGGATGCT CTCCCACCCG GACCCCTCCA GCCAAGCAQC 840 

CCTTCAACTr CTTTTATTTC TQGGTAACAG AAGTAGACAG ACAGGTTACT TGGTCJTATCT 900 


TCTOTTAAAG AGGATPQCAC GAGTGTGTTT TCCTCACACA CTTTGATTTG GAGAATTGGT 960 

GCTAGTTOGC AATAGATAAC TCAGCGTAGA TAGTATTGCA AAAAGGGGAG GAAATACACA 1020 

55 ACAATAATAA ATGTAAAAAC CTGCTATTCA ACATGCAGTT TTATTTCGAR GCCAAAAATC 1080 

TAGAGCTTTC CCAAGATCCT GTTGCCTTAG isCACATNCAC ACTTCAACAG TGCACACTAT 1140 

CCAACAGIGC ACACTATTCA ACAGTGCACA CTATTCAAAA GCGTAGACTA TTTTTTTGCA 1200 

WO 98/54963 PCT/US98/11422 



TOTTCAAGAT ATrTGTTTTG GTCTTATGTG TGTGTGAGAG AGAGAGATTC CTTTGACATT 1260 

AAGGAGCATC AATGAGAAAA GATGATGAGG CAGGAATTAA TAAAGAAATG AAGTCGTGTC 1320 

TCTTTOCTIG CCTGTCAGAG GGCACACAAT TTCATAAACA CXrATGCCTGG ACAATTTGAT 1380 

ATTAATATTT AACACCTCTG CATCTTTTTC TTAAAAAAGA ATATGGGCCA GATACAGTGG 1440 

CTCACATTTC TAATCCCAGC ACTTrGGGGA GCCAAGTTAG CAGAATCCCT TGAGCACAGG 1500 

AATCTCAAAC CAGCTTGGGC AACATAGTGA GATCCCATCT NTACAAAAAA CTTAAAAATT 1560 

AGCCAGGCAT GATOGCACAT TCCTGTACTC CTAGCTACTC AGGAGGCTAA GGTAGGAGGA 1620 

15 TTGCXTTCAGC CCAGGAGTTC AAGGCTGCAG TGAGCTAAGN ACGTGCCAGT ACACTCCAGC 1680 

CTCAGCCACA AAGTCAGACC CTGrPCTCXSCA AAAAAAAAAN TTAAAAAGnC GGGGGGGGGC 1740 

CCGGTACCCA AATCGCCGGA TATGATCGTA AACAATC 1777 




(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

r i TT T TTrTT riTm Ti Tr ri Tr mriT tttttttggg aatgagaaaa taactttatt 60 

TTCArrcjrcG ggagcgggcc GATGTCCAGC CTCAGAACTT CTGGAACTGC TTCTTGGTGC 120 

CGGCftGCCTT GGTGACCTTG AGCACGriTGA AGCGCACTGT CTTGCTCAGA GGCCGGCACT 180 

CGCCCACrcr GACGATCTCA CCGATCTCGA CGTCCCTGAA GCAGGGGGAC AGGnCTACAG 240 

ACATOTIdT GTGGCGCTTC TCGAAGCGGT TGTACTTGCG GATGTAGTGC AGATAGTCTC 300 

GGCGGATCAC AATGGrrcCTC TGCATCTTCA TCTTGGGTCA CCACGCCAGA GAGGATCOGC 360 

CCTCGAATGG ACACATTACC AGTGAAGGGG CATTTCrrcT CAATGTAGGT GCCCCTCAAT 420 

AGCCrcCTTC GGGrrGTCTTT GAAGCCCAGA CCX3ATGTTCT TGTTAGTAAC CCGCGGGAGC 480 

TTdCCTTGC CACTTTCTCC CAGCAGGACC CTCTTCTTGT TTTGAAAGAT GGTCGGCTGC 540 

TTTTOOTAGG CACGCTCAGT CTGAATGTCC GCCATCTTCT CCTSXXXOMKC TCCTGCAGCC 600 



CGGGGGATCC ACTAGTTCTA GAGCGGCCGC ACCGCGGTGG AGC 643 



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1220 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

GGCACGAGGA TGATAGRCCT ACTGGAGGAA TACATQGTTT ACAGGAAGCA TACCTACATR 60 

AGGCTTGATG GCTCATOCAA GATCTCGGAG AGGCGAGACA TGGTTGCTGA TTTTCAGAAC 120 

AGGAATGACA TCmGTGTT CCTGTTAAGC ACACGAGCTG GAGGACTGGG TATCAATCTC 180 

15 ACTGCraiAG ACACAGTQCA TTTTCTATGA TAGCGACTGG AACCCCACTG TGGACCAGCA 240 

GGCCATQGAC AGGGCCCACC GCTTAGGGCA GACAAAGCAG GTTACTGTGT ACCGGCTCAT 300 

CTOTAAAGGC ACCATTGAAG AACGCATTCT GCAAAGAGCC AAC3GAGAAGA GTGAGATTCA 360 


GCGGATGGTC ATTTCAGGTG GGAACTTCAA ACCAGATACC TTGAAACCCA AAGAGGTQGT 420 

TAGTCTTCTT CTAGACGACG AAGAGTTGGA GAAGAAACGT ATGTACTCTA AACCTCTATA 480 

25 CACTCCCCTC ACGTATCTGA GAATGGAAGA QGTACTTGGS TGTGTGCCAA GGGTTAGGCA 540 

AAGCCAGAGG CTGTATTTAG GGAAAGTATT TTTGTGCTCA TATTITATAT AAAAACCCAA 600 

ACAAGAATGT GTTTGTAGGC CAGGCGTCGT QGCTCGOGCC TCTAGTCTCA GCATTTOGGG 660 


ARGCCAAAGT GGGCAGATCA CCTGARGTCA GGARTTTGAG TTTGARACCA GCCTGGCCMA 720 

CGTTGTGAAA CCCCACCTCT. ACTARGARTA CSGAAAATTG GTTGGGCATG C3TGGCGGGCA 780 

35 CCICTAATTC CAGCACITTG GGAGGCTGGG GCAGAANAAT TGCTTGAGCC CAGGAGGTGG 840 

AGATTGOGGT GAGCCGAGAT YCTGCCATTG CAtTTCCAGCC SGGGCAATAA GAfTTGAAAYT 900 

CCATCTTTTA AAAACAAACA AAAACAAAAA ACACAAGACG GCTCACACCT CTTAATCCCAG 960 


CACnTOGGA RGCCGARGCA GC3TGGATCAC GARGTCAGGA GTTCCAAGAC TAGCCTGGCC 1020 

AACCTCGTCA AGCCCCGTCT CTACTAAAAA TACMAATATT AGTOXSGCGT GGTQGTGGGC 1080 

45 ACCTGTAATC CCAGCTACTC QGC5AGGCTGA GGCAGGAGAA TCCCTTGAAG CTAGGAGGCA 1140 

GAGGTTGCAG TGAGCCAGGA TCGTGCCATT GCACTCCAGC CTGGACAACA AGAGCAAGAT 1200 

TCCATCTCAA AAAAAAAAAA 1220 



(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 721 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 

AATTCGGCAC GAGCCAGGTT AGCCGGAAGG GCAGCTCTCC AGGCCCTGCC CACCCCACAG 60 


cxxccvccrr aoxxacagcg oggcgtctcc arorcGccAT agaaacggaa crGGcrcrrr 120 

TCAACAGTGC TGCAAGAGGA TGOrTATTTA ACGCTGGCCC CCAAGGAGGA AAGGCACAGA 180 

10 CYTTCCTCCC TCCIX3GAACA TCCAAGGGCA CTGGATCCTC TCTGrCCCTC TGAGATGGGG 240 

TGCCACTCCA GCAAGAGCAC CACGGTGGCA GCIGAGTCCC AGAAGCTTGA AGAAGAGYGC 300 

GAGGGAAGAG AGCCAGGTrCT GGftGACCGGC ACCCAGGCAG CAGACTGCAA GGATGCCCCG 360 


CTCAflGGATG GAACCCCTGA GCCAAAGAGC TGAAATGCCT CTCTCCAGAG TCGGACCCTC 420 

ACCTCYTTCC TGGAACTGCC rTTGGCCCCA GAACCATGAG ACAATCCCCA CCCTGAGAAG 480 

20 CTCCGATCAC TOGGAGGAGA GAGAAAGCCT CCAGCTTTGG GATTCAQGCT TCAGAAGTTT 540 

TTAGCAGCCT TTGCTCATTC GAGAGGTGGG GAAAGGATAA AGnTCTTATA AGGAAATCCC 600 

TAATTTCCCC CAGCTCCTCC CCNCCNGAAG AAGGAACNAA AGAAAGTTCC TTCCACACGT 660 

TTTGTTCGAA ACrrTTCCCT TGCCAACTIT CCTTGGATTG CCAGAACAAA GCCCTCCAGA 720 



A 721 

(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1468 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 
AOGAATTAAT GTTTATAAAT GACTGrTACTG AATTTAAAAC CGTACAGTTT CATTTGCATT 60 
45 TTGACATTAC TTTATTATAC ATTTTGCATT TAAAAGGCTG CACCAGTTGG CmTCTTCT 120 
CrriTr Al TCT CAAAATATAG AGATTCTGTG ATTTATTTGC OCTGmTATG GATTAAAAAG 180 
AAAATTCTAA TATAAAGCAT TTCAATAGGA TGCATAGGrTA TATTACGTTT TTTAAATGCT 240 


TTAGATCTGT GATTCTTGAC TTACTATTTA TTTTATCCCC TTTAAGTCAG GGATGCrTTA 300 
TTCTATTTTA AAGCACTTAT GAGTTACATG TTGTAATCAA GTTTGCACAA TATATTTATC 360 
55 TATA-TCAGGA ACCCATAAAT GAATAGCTAA TTTTTAAAAT GCCA1TAAAA TGCATGAAAT 420 
KCTTATTAAA ACCTTACTAT ACTATTTCTT CAAGGCAAGT AAATTGACCA TGRGRAAAGR 480 
ACACAGTTAT TAAACACTGT TGACAGGAAA ATTCTCCTTG ATAACATAGG ACAATTAATG 540 




GAAAAAAAAA TTCTCATTAT TTGCAAAGAA TGAACAAGTT AATGAACAAA CAAACTAGAT 600 

TTCCTTATOTT TTCAGCTTTT GTATCATGTT TAATTGTTTA ATTTGGTTGA AAAACTGCAG 660 

5 TTCAGAAATC AGATAGCAAT ATAGACATTC ACAGCAGCTC TGTGGATACC ATGTAATTGT 720 

CAGC7EAATTT CAGAATGriTG AAAATTATTC AGTGCAGCCX: TCATAGTATC ATACTTGAAG 780 

AAATTCATTA CAGTTCCACT AAATTGTTGA AGATAAATTA TTTTTAAAGG TTATGAAAAC 840 


TAA£?ITATAT TAATTCATAT GTTTGATTTT TAAATCCCAC CTCCTCAAGC TATCCAATTT 900 

NCTGACTTTG AAAATAACCA TGAGAGATGC CACATTTCTC TCTGGGAAAC TACCACTCAA 960 

15 AGAATAATTG TTAAAAATTA AGCTTTTAQG TATTAGAAGC TGTTATAAAG TATAAAATTA 1020 

AGATATAAGC AGATCACATG TAAATCATTC CTAAAGCACA AGAAAAGAAT GTGCCTTGAT 1080 

GTACATATAT TACTAAGTTG CCTCTCCCAG TTTACTTTAA AAATGGCTTT AAGGATAAAG 1140 


AATAAATGTG ATAGCTGTGC ATGCATTATA TATTTGCATT TQCAAATTTC CCAITGTnT 1200 

AACAGCTCTG TGGCTGACTT TCAATTTTAA GACGTGAATT GACATACAGC CCATAACTTT 1260 

25 ATAATGGCTC CTCATTTATC TTATCTTTCA GTTACTQGAA AAACATTTCA ACCTGACTAA 1320 

AATTTGGAAT TGTGTCTTTT ATGTTCCATC CTCTGTTGTT ACTAGATTTA GTTTAAAAAT 1380 

TGTGTATGAC CATTAAaXSTA TGTCATAAAC ATGTAAATAA AASftTGTTGA ATCTTGrrGA 1440 


AAAGCAWRAA AAAAAAAAAA AAACTCGA 1468 



(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 


TGAATTTTTT GCCAAACTTA GTAACTCTGT TAAATATTTG GAGGATTTAA AGAACATCCC 60 
AGTTTCAATT CATTTCAAAC TTTTTAAATT TTTTTGrrACT ATGTTTGGTT TTATTTTCCT 120 
50 TCTOTTAATC TrTTGTATTC RCTTATGCTC TCGTACATTG AGTACTTTTA TTCCAAAACT 180 
AC?rGGGTTTT CTCTACTOGA AATTTTCAAT AAACCTGTCA TTATTGCTTA CTTTGATTAA 240 
AAAAAAAAAA AAAAAAAAAA AAACCCCNAG GGGGGGGCCG GGTNCCCAAT CCCCCCCAAA 300 




(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2243 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

TGCCTCCCrr cctgcagatt gtggacagta gttoctcagc ctgcaccctg gattccttct 60 


tccccttcct agctccatgg gactcgcccc aagactgtcg cttcaaggac caccagcccc 120 



TTACTCTTCA AGCCCTGACT GTGGACnTQG TAGATGCCTC TGATCCTCAG TATTCTCTCT 180 

15 GGCAATGTTC CACGGCTTCT CCTTCCTGGG AGCTGGCTCC ATAACTTGAT TTTCCCCAAA 240 

CGTGTTOCAA TCCCTGCTGC CCCTTAGCCA CCCAGGGTCT TC3TC3TGQC5TA TGAGTGTAGA 300 

GGATGGGGGT ATGCCAGGCC TGGGCCGTCC CAQGCAGC3CC OGCTQGACCC TGATGCTACT 360 


CCTATCCACT GCCATGTACG GTGCCCATGC CCCATTGCTG GCACTGTGCC ATGTGGACGG 420 

COGAGTGCCC TTYCGGCCCT CCTCAGCCGT GCTGCTGACT GAGCTGACCA AGCTACTGTT 480 

25 ATCCGCCTTC TCCCTTCTGG TAGGCTGGCA AGCATQGCCC CAGGGGCCCC CACCCTGGCG 540 

CCAGGCTGCT CCCTTCGCAC TATCAGCCCT GCTCTATGGC GCTAACAACA ACCTGGTGAT 600 

CTATCTTCAG CGTTACATGG ACCCCAGCAC CTAOCAGGTG CTGAGTAATC TCAAGATTQG 660 


AAGCACAGCT GTGCTCTACT GCCTCTQCCT CCQGCAOCGC CTCTCTGTGC GrTCAGGGGTT 720 

AGCGCTGCTG CTGCTGATGG CTGCGGGAGC CTGCTATGCA GCAGGGGGCC TTCAAGTTCC 780 

35 CGGGAACACC CTTCCCAGTC CCCCTCCAGC AGCTGCTGCC AGCCCCATGC CCCTGCATAT 840 

CACTCCGCTA GGCCTGCTGC TCCDCATTCT GTACTQCCTC ATCTCAGGCT TGTCGTCAGT 900 



GTACACAGAG CTGCTCATGA AGCGACAGNG GCTGCCCCTG GCACTPCAGA ACCTCTTCCT 960 


CTACACTTTT GGTGTGCTTC TGAATCTAGG TCTGCATGCT GGCGGCGGCT CTGGCCCAGG 1020 

SCTCCTQGAA GGTTTCTCAG GATGQQCAGC ACTOGTGGTC CTGAGCCAGG CACTAAATQG 1080 

45 ACTGCrCATG TCTGCTGTCA TGAAGCATGG CAGCAGCATC ACACGCCTCT TTGTQGTGTC 1140 

CTCCTCQCTG GrGGTCAACG CCGTGCTCTC AGCAGrTCCTG CTACGGCTGC AGCTCACAGC 1200 

CGCCTTCTTC CTGGCCACAT TGCTCATTGG CCTOGCCATG OGCCTGTACT ATGGCAGCCG 1260 


CTAGTCCCTG ACAACTTCCA CCCTGATTCC QGACCCTGTA GATTGGGCGC CACCACCAGA 1320 

TCCCCCTCCC AQGOCTTCCT CCCTCTCCCA TCAGCAGCCC TGTAACAAGT GCCTTGTGAG 1380 

55 AAAAGCTGGA GAAGTGAOQG CAGCCAGGTT ATTCTCTGGA GGTTGGTGGA TGAAGGGGTA 1440 



CCCCTAGGAG ATGTGAAGrTG TGGGTTTGGT TAAGGAAATG CTTACCATCC OCCACCCCCA 1500 
ACCAAGTTCT TCCAGACTAA AGAATTAAGG TAACATCAAT ACCTAGGCCT GAGAAATAAC 1560 




CCXIATCCTTG TTGGGCAGCT CCCTGCTTTG TCCTGCATGA ACAGAGTTGA TGAAAGTGGG 1620 

GTGTGGGCAA CAAGrTGGCTT TCCTTGCXTTA CTTTAGTCAC CXIAGCAGAGC CACTGGAGCT 1680 

GGCTAGTCCA GCXXAGCCAT GGTGCATGAC TCTTCCATAA GGGATCCTCA CCCTTCCACT 1740 

nCATCCAAG AAGGCCCflGT TGCCACAGAT TATACAACCA TTACCCAAAC CACTCTGACA 1800 
GTCTCXrrcCA GTTCCAGCAA TGCCTAGAGA CATGCTCCCT GC(XTCTCCA CAGTQCK3CT 1860 

CCCCACACCT AGCCTTTGTT CTGGAAflCCX: CAGAGAGQGC TGGGCTTGAC TCATCTCAGG 1920 

GAATGTAGCC CXTOGGCCCT GGCTTAAGCC GACACTCCTG ACCTCTCTGT TCACCCTGAG 1980 

GQCTCyrCTTG AAGCCCGCTA CCCACTCTGA GGCTCCTAGG AGGTACCATG CTTCCXACTC 2040 

TQGGGCCTGC CXXTTGCCTAG CAGTCTCCCA GCTCCCAACA GCCTQGGGAA GCTCTGCACA 2100 

GAGTGACCrc AGftCCAGGTA CAGGAAACCT GTAGCTCAAT CAGnCTCTCT WTAACTQCAT 2160 

AAGCAATAAG ATCTTAATAA AGTCTTCTAG QCTGTAGGGT GGTTCCTACA ACCACAGCCA 2220 

AAAAAAAAAA AAAAAAACTC GAG 2243 



(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1082 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 
GCCAAGCTCT AATACGACTC ACTATAGGGA AAGCTGGTAC GCCTGCAfaCP ACCGGTTCCG 60 
40 GGAATTCCCG GGTCGACCCA CGCGTCCGCT TCOGTGTGTC AAAATCCTCA CCTCCTTCAT 120 
AACCATCTCC CACAATTAAT TCrTGACTAT ATAAATTTAT GGTTTGATAA TATTATCAAT 180 
TTGTAATCAA TTGAGATTTC TTTAGTGCTT GCTTTTCTGT GACTCAACTG CCCAGACACC 240 


. TCATPGTACT TCAAAACTQG AACANCTTGG GAATGCCATG GQGTTTGATA ATCTGCCAGG 300 
GACATGAAGA GQCTCAGCTT CCTGGGACCA TGACTTTQGC TCAGCTGATC CTCaiACATGG 360 
50 GAGAACAACC ACAnTTTCT TTGTGTGrTGC TTCTAGCAGC TGTTCGGGAG GACCKTGACC 420 
CAAYAGTOTT CCCATGCTGT TTCrTGTGAA ATGCTCTCGG CTATGTAGCA GCTTTTGATT 480 
CCCTGCATAC CCTAGGCTQC TGCCCCTATC CTGTCCCTTG TTTATAACAT TGAGAGGTTT 540 


TCri\GGGCAC ATACTGAGTG AGAGCAGTGT TGAGAAGTCG GGGAAAATGG TGACTACTTT 600 
TAGAGCAAGG CTGGGCATCA GCACCTGTCC AGCTCTACTT GTGTGATGTT TCAGGAACTC 660 
AGCCCCTTTT TCTGCCTAQG ATAAGGAGCT GAAAGATTAA CTTGGATCTY CTAATGGTCC 720 
AAATCTTTTG GTCACAATAA AGAGTCTCCA AATTAGAGAC TGCATGTTAG TTCTQGATGG 780 

ATrrcCTTCGC CTGACATGAT ACCCTGCCAG CTGTGAGGGG ACCCCX3TTTT TAAGATGCAT 840 


GGCCAAGCTC TCTGCAAATG GAAATGCTTA CACTOSGrTGT TGGGGATGTT TGCTACCTCC 900 

TCCTATTTTT GTQGTTTTGG TTCTCCCACT ATGGTAGGAC CCCTGGCCAG CATXTQGCT 960 

10 TOTCATCTCA GCCCCATTCA CTACCTTCTC ATCCTCTGAC5 C3TACTACTGC CTCTGCAGCA 1020 

CAAATTTCTA TTTCTCTPCAA TAAAAGGAGA TGAAAATAAA AAANAAAAAA AAAAAACTCG 1080 
NG 1082 

(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4313 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

CAAGCTGOTT TGAAACTAGG GGTCGGGCTC GGCCGTCGTC GTTGTTTGTC GCCGCATCCC 60 


CGCnCCGGG TTAGGCCGTT CCTGCCCGCC CCCTCCPCTC CTCCCTTCGG ACCCATAGAT 120 

CTCAGGCTCG GCTCCCCGCC CGCCGCAGCC CACTGTTGAC CCGGCCCGTA CTGCGGCCCC 180 

35 GTOGCCACCA TGTCCCK3CA CGQCAAACGG AAGGAGATCT ACAAGTATGA AGCC-CCCTQG 240 

ACAGTCTACG CGATGAACTG GAGTOTGCGG CCCGATAAGC QCTTTCGCTT QGCC-CTGGGC 300 

AGCTTCGTGG AGGAGTACAA CAACAAGGTT CAGCTTGTTG GTTTAGATGA GGAGAGTTCA 360 


GACrTTATTT GCAGAAACAC CTTTGACCAC CCATACCCCA CCACAAAGCT CATGTGGATC 420 

CCK5ACACAA AAGGCCTCTA TCCAGACCTA CTGGCAACAA GCGCyTGACTA TCTCCGTGTG 480 

45 TCGAGGGTTG GTCAAACAGA GACCAGGCTG GAGrTGTITGC TAAACAATAA TAAGAACTCT 540 
GATTTCTCTG CTCCCCTGAC CTCCTTTGAC TGGAATGAGG TGGATCCTTA TCTTTTAGGT 600 
ACCrCAAGCA TTCATACGAC ATGCACCATC TGGGGGCTGG AGACAGGGCA GGTGTTAGGG 660 

CGAGIGAATC TCGTCTCTOG CCADGTCAAG ACCCAGCTGA TCGCCCATGA CAAAGAGGTC 720 

TATGATAITC CATTTAGCCG GGCCGGGGGT GGCAGGGACA TGTTTGCCTC TGTCGGrQCT 780 

GATGGCTCGG TOCGGATCIT TGACCTCCGC CATCTAGAAC ACAGCACCAT CATTTAOGAA 840 
GACCCACAGC ATCACCCACT GCTTCGCCTC TGCTGGAACA AGCAGGACCC TAACTACCTG 900 

GCCACCATGG CCATGGATX3G AATGGAGGTG GTGATTCTAG ATGTCCGQC3T TCCTGCACAC 960 



CICTSGCTAG GTTAAACAAC CATCGAGCAT GTGTCAATGG CATTGCTTGG GCCCCACATT 
CATCCTCCCA CATCTGCACT GCAGOQGATG ACCACCAGGC TCTCATCTGG GACATCCAGC 
AAATOCCCCG AGCCATTGAG GACCCTATCC TGGCTTACAC AGCTCaJAAGG WGiUSATCAAC 
AAKTTCCAGT GGGCATCAAC TCAGCCCGAA YTGTCGCCAT CTQCTACAAC AACTGCCTGG 
AGATACTCAG AC?rCTAGTGT TGC?rGGCGCT GTGCCCACGA GGCAGGGGCT rrrGTATrTC 
CTGCCPCTCC CCCACCCCCA AflGTAAGAAG AAACATOTTT CCAGTGGCCA GTATGTCnT 
CATTGCTTTG CACCCACTGT TACCAGAAGC TGCTCTAGGA GTTCCTGGCC AGTCACCCCA 
TCGCCCTCrc TOGCAGACTC AGTQCTGrTGT GGCGCCTCCT CAGCXX^GGG CTGAGTrTTA 
AGATTTICTC TCXTTTTCCTC TTCTCCTTTG GTTCCTCAAT TAAAAAATGT GTGTATATTT 
GrTTCTCAGG CGTTCTCrTG AGGAGCAGTT CACGCACTGG CTGTGTCTAT TCCTCTGCCC 
AGGTCTCTCT GrtTQCTGCC CAAKGYWKKT TTTCATGTCT CGTCCATGTC CATGTTCGTC 
TTAGCACTWA CGTQGGAACA AATACCAATT TGTCTTTrCT CCTAGTATCA GTGrrGTTTAA 
CAAATTTTAA CmGTATAT TTGrTTATCTA TCAGGCTAAT TnTTTATGA AAAGAAiriT 
ACTCTCCTCC TTCATTTCTT TGrrCTTATAG TCCTCCCTCT TTGCACCTTC TTCTCTTCCC 
TCACTXCTG GftGCrGGTAC TGGGCCCCTG GCCCCATGAG CAGTTTGCCT TCTTGAGTrCA 
CTCCCTCTGT ACTACATACC TGACCQCX3AG TCCAAACCAC CrTGGTGCTC TGAAGTCCAC 
TCACTCATCA CACCTTTCTT AGCCTGGCTC CTCTCAAGGG CATTCTGQGC TTGTAAACAG 
ACATAGGAAG CXrrCTGTTTA CCCTGAAGCA CCACrGTCCA GCCCATTGGrT TCCCACTQGC 
AGGATGCTAG AGCTGAGAGA AACMQCTCT GAGGCTACCT GACTTGAGGG GAATCGTTTC 
ATGAAGCTGA ACrTTCAAGCA lOTTTCCACT ACATTCTTTC AGAGTCTGTT TTTCCATCCA 
AATATAAGCX: CCAGGCCATT CCACTTAGTG TCTTTTCAAT GATAGGCAAG AATGATATCT 
GAOTIGAACT TCGGTGCTTC TCTTOTrPGA GmACTGTG CXTOGTGGTA TATTGGGCAT 
TCTTTCGATT GAGrTGTTCTG AGGTCAGAGA GTCTTCCCGA QGCATCCTGT CTGTGCTTCC 
AACCCTGAAC AAGACCTTAC ATGAGAGAOXS GACTCA3X3GA CTGCGGCAAT CCTGGGCTGT 
CAAGTGGATA GATAGTTAAA AAGCATTATA CTGnOGGTAA TGAAAAGGGA GGAAAAAAAA 
AGAAGGAAAA GGAATTATAG ACCCCCAGGG TCAGCCAGTT AAGAGCTCTA CCCACACCTG 
TCAACCCCTC TCTCCCCCAG TTTAGGrTTCT GAGCAGTATT GGACTTCTAG CCTGCAGrTTG 
TCTTTTGACT TQCAGGCCXX: AGrTGTCTTTC TGTTATGTGA ATGAGTTCCA TGGAGGGGCA 
TATOrGTCAT TCCACCGTTA GATGAGCCCT TGGOGCAGGC AGTTTGGGAT GTGCTCTTOG 
GGGAAAGTTO GCTCrTTrCCT TGCGCTCTGC TCCTACCCXiA AGTTTTTAAG TCCCTCTGAA 
TTCCTCATCT GAGATTAGTA GAGTAGCAGG CCTGAAGGAT GATGGTTTTG 
TTCTCACCTC CTTGAGAAGT AAAACflGTAA CITTGTTCTT CTGGGCCCTT AAGCTTTTTT 
GGITAAGTCT TCCTTTTCAG AAGTAGATGT CATTATATGC CAAAAGTCTA GCTCTTTGCT 
TTACCATACA GGGACCTGTC CCAAAGAAAA AGGCPCTTTT TTTAGCCAGC ATATTTCCCX 
TTCTACCCTT TTACTTTGTT GTTCTGATTT TAGGACTCTG GCTCGCCATG TGCTTGrrGGT 
TCCCTCTCCT GCATPTGCCA CTGGATTTGC ACTGCATCGT TTGGAGATAC AAAGCGAGCA 
CTTCrTGGTC AGAACCCTCC TCTCCTTTTC ATrGrCTTTG ATAATGGTTA CTGGGTCCTT 
CICTCAAGCSG TAGCAAGGCC AAGCTGATQG CTGCrTGTTT AGGAGGCCAT CAGTTCCTTC 
CTCTGGAGAA GGGTCTGAAA TGGAAGTCAG TGGTAGAAGG GGCTGGTCTG CTGGGCAGQG 
CTTACATCCA CTCAGTTCTA AGATTCCTTT CCTGATCTGC ACCTACGCCT GGTCTGTATG 
GTCGAATTrG TCAGCTOGAA CTCAGAAACA ACAACTPGAA AAAAAAATAA TAATTAGAAC 
ATATTTCCAT AAGATAGCTA TITACTCTQG AAACXMCAA CTTTTGAGAT TTCCCTTGOC 
CTOTGGACGC CCAGCTCCTG TCATCCTTCC TTAGGTCCTG CAGTACAGTC TTCCCCTGAA 
TGCCACCXXSG GACCCAGGGG GACTCCACCC CXXTEAAGCAA GCACACACAT ACTCACAGTT 
GATGAGTTGC TGCTCTITGA GTCCCAGCTC TCTTACCCTC CCnTACTCC ACCAGCCCGA 
CXSACCCATCA CTGAGGAGGG GATTTCTACA GTCTCAQGAT TTAGAAAGTC OXSTAAGCCAT 
CCATGCTCCA GAAAGCACCG ATCTGTTGTA GTTGCAAAAA C3\ACTCTGTA ATTTGTTGAG 
GTTCTCAAAC TGACAGCCAG CGAGftCTGGG TGGGAGQCCC TGGATCTGTT CTCCCTGACT 
GCGGGAGGAG CAGCCACTAG GACTTTAGCA GGAAGCCCAC ATQGAGGCTC CGCCAGGCTG 
TOGCCCAGCT GGrTGAIQGCC CTTTTGCTCC TGGCAGCCTG AGGCACAGCT GCCTGTATTG 
TCCTCATCTG TTCTCACTGA AGGATGGAGG TQCTGAATAA ATTAGGCCTC AGGCOTCTAC 
CACCAGflGAG CTOGAGAATG GGTCCACGTC ATTCAAGGAC CTGAA'iTlTT TATGCTCAGG 
AGCATTGGAA TCCTCTTCTT CCAGQGAGGA ATTASCCTGC AAGCrTTAGGA CTTGAAGAGG 

GAAGCTArrr aataactggg cgaggatggg tgtqgtggct cacacctgta atcccagcat 

TTK3GGAGGC TGAGGnX3GCC AGATCCCAAG GTCAGAAGAT CGAGACCATC CTGGCTAACA 
TQGTCAAACC CXIATCTCTAC TAAAAATACA AAATTAAATT GGCCGGGCGT GAA 
(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1183 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

QGCAGAGCCT CAAGCTGACT TGGATTA3CT GGTCCCTCAA ATCTACCGAC ACATGCAGGA 60 

GGAGTTCCGG GGCCGGTTAG AGAGGACCAA ATCTCAGGGT CCCCTGACTG TGGCTGCTTA 120 

10 TCAKWYGGGG AGTOTCTACT CAGCTGCTAT GGTCACAGCC CTCACCCTGT TGGCCTTCCC 180 

ACTTCTGCTG TTCCATGCGG AGCGCATCAG CCTTGTGTTC CTGCrrCTGT TTCTGCAGAG 240 

circcrrcTC CTACATCTGC TTGCTGCTGG GATACCCGTC ACCACCCCTG GTCCTTTTAC 300 

TCTGCCATGG CAGGCAGTCT CGGCTTGGGC CCTCATGGCC ACACAGACCT TCTACTCCAC 360 

AGGCCACCAG CerGTCTTTC CAGCCATCCA TTGGCATGCA GCCTTCGTGG GATTCCCAGA 420 

20 GGCTTCATCGC TCCIGTACTT GGCTGCCTGC TTTGCTAGrrG GGAGCCAACA CCTTTGCCTC 480 

CCACCTCCTC TTTGCAGTAG GTTGCCCACT GCTCCTGCTC TGGCCTTTCC TGTGTGAGAG 540 

TCAAGGGCTG CGGAAGAGAC AGCAGCCCCC AGGGAATGAA GCTGATGCCA GAGTCAGACC 600 

CGAGGAGGAA GAGGAGCCAC TGATGGAGAT QCGQCTCCGG GATGCGCCTC AGCACTTCTA 660 

TGCAGCACTG CTOCAGCTGG GCCTCAAGTA CCTCTTTATC CTTGGTATTC AGATTCTGGC 720 

30 ctctgccttg gcagcctcca tccttcgcag gcatctcatg gtctggaaag TGrrrrccccc 780 

TAAGTTCATA TTTGAGGCTC TQGGCTTCAT TCrTGAGCAGC GTGGGACTTC TCCTGGQCAT 840 

AGCTTTGGTG ATCAGAGTGG ATGGrrGCTGT GAGCTCCTGG TTCAGGCAGC TATTTCTGGC 900 

CCAGCAGAGG TAGCCTAGTC TGrTGATTACT GGCACTTGGC TACAGAGAGT GCTGGAGAAC 960 

AGECTAGCCT GGCCTOTACA GGTACTQGAT GATCTGCAAG ACAGGCTCAG CCATACTCTT 1020 

40 ACTATCATGC AGCCAGGGGC OGCTGACATC TANGACTTCA TTATTCWATR ATTCAGGACC 1080 

ACAGTCGAC?r ATGATCCCTA ACTCCTGATT TGGATGCATC TGAGGGACAA GGGQGKCGGT 1140 

STCCGAAGTG GAATAAAATA GGCGGGaTTG GTGACTTGCA CCT 1183 

(2) INFORMATION FOR SEQ ID NO: 148: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 734 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

GAATTCGGCA GAGTGAAGCA TTAGAATGAT TCCAACACTG CTCTTCTGCA CCATGAGACC 60 

AACCCAGGGC AAGATCCCAT CCCATCACAT CAGCCTACCT CCCTCCTGGC TGCTGGCCAK 120 

GATOTCGCCA GCATTACCTT CCACTGCCTT TCTCCCTGGG AAGCAGCACA GCTGAGACTG 180 

GGCACCAGGC CACGTCTCTT GGGACCCACA GGAAAGAGTG TGGCAGCAAC TGCMTGOCTG 240 

ACCmCTAT CrrcrCTAGG CTCAGGTACT GCTCCTCCAT GCCCATGGYT GGGCCGTGGG 300 

GAGAAGAAGC TCTCATACGC CTTCCCACTC CCTCTGGTTT ATAGGACTTC ACTCCCTAGC 360 

CAACAGGAGA GGAGGCCTCC TGGGGTTTCC CTRRQGCAGT AGGTCAAACG ACCTCATCAC 420 

ACyrCTTCCTT CCTCTTCAAG CGmTCATGT TGAACACAGC TCTCTCCRCT CCCTTGTGAT 480 

rrcTGAGGcyr CACCACTCCC ARCCTCAGGC AACATAGAGA GCCTCXTOTT CTTTCTATGC 540 

TTCGTCTCAC TCAGCCTAAA GTTGAGAAAA TQGGrTGCCAA GGCCAGTGCC AGTGrTCTTGG 600 

GGCCCCTTTC GCTCTCCCTC ACTCTCTGAG GCTCCAGCTG GrTCCTGGGAC ATGCAGCXAG 660 

GACrOrcAGT CIGGGCASGT CCAAGGCTtG CACCTTCAAG AAGTGGAATA AATGTQGCCT 720 

TTGCTTCTAT TTAA 734 

(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1405 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 
GGCACAGTGG ACCCCAGACT CCCTCTCCGC CTTTCTCTGC CTGGGGAGAC CCACTGTGTG 60 
40 CATGGCATCA CTGACTCCCA TACCTCTGGC TATCAAAGGT TTCTGCCATG GCCACCCTGG 120 
AAGSAAACCA GAOQGAGGTA GACAGGGAGA TCAGGTCCCT TCTACTCTGG TTCCTGCTCT 180 
GTCAAArrcr CTCAGGCTGG CTGTGTCCAG ARGGTCCCTG GTTCTCTCAR GGATGCCAAA 240 

TCTACAAGAA TCTCTCCTCT TCCAGTTCCT ATAACCTCTC CTTCCnTTG TCTCTTTAGA 300 
ccttggagta GTAGCAGCCA GGTTCTTTCT ATCTCTGGGT TAGTGCATTA TCTCTGGTQG 360 
50 CTCCCTTACC CAGGACTTTG GGAATGGTCT TTTTGTAATA GATTCrCCTC AAATAATTCA 420 
ATTTTGAGTO rrCTOTATGT ATCCTGCTQG GAGGTrGTTA TATACAAATC ACTGTGCCCG 480 
TTTAGCAGAG AAGGAGACTG AAGCTCAGGG AGGTTAAGTG TCTTTCTCTA GGTOGrrATTG 540 

TGGAGAAAGT GGCTCACTQG GGACTTGAAT GAGGTCCCTA GrTTTCATGCT CGGAGGGCAA 600 
AGANGAATCT CCAATTGGCC TGAGATAAGC CTCTGGTAAA ATGTACTGTA CATAATAGGT 660 
60 AATCAATAAA TGTTGGCTGA TGACAAACAT GTTTTCTTTG TrCATTAGTT ATAGTGATTA 720 
TCTTCTAAAT AACTCCMACA AGGAARTCAG CACATTTGGA ATATCAWTAT CTTTCCATGA 780 

TAATATCTTT CCMyOGAAAG AWAATGATAT TCCMAACTGG GAGTGTCCO* AGCARATCTG 840 

ANTCTOTOTA TTCGCCCTGG GGTGGGCCAG CCCCTTAGAC TCTATGGTCT C\TTCTCTTT 900 

GTTTACAAAA TTGAGATAAG GCCTTATTCT CTCCCCACCC CACCCATCCA T>,rl^l-rri^5 960 

10 AGAATAAAAT GAGAGGATCT GTGTCAAGGG TGTATTITCG CAATAGTCTC TSAGCCATTT 1020 

TCTCAGCACC TCCATACTGT TGACACTCAA GTAATATTTC ATCAGCATTC CATICAGGt3T 1080 

CCTCCCTTAA 'TCAGGTCTCC GATGTTACAAG AC3TYGTC5AGG TGGCAAAGGA TOSGCTCCTG 1140 

AGGAAACACT TAGGAAACTG GGCTTTCTGC CATTAAAAGA GACAAACCTT TGTGGTGACC 1200 

TAATTAAAOT TTTTAAAATr CAATTTGGAA AGTTAGCAAG CTAGCTCCTK TCCAGGWAAA 1260 

20 ATAAGGAGTC AGTCCATGAC CTAACCGGTC CCGGGCTCCT TGCCATTCCA AACAACTGCA 1320 

CTAAGTTTAT CACOTTCTTT CAGGGACTGA GGTTTCCAGG CACAGACTTG a^TAAGGAAG 1380 

GATGTCCTAT GGGGTCACAT TGATG 1405 

(2) INFORMATION FOR SEQ ID NO: 150: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2890 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

TTATATGCTA CAGCTACAGT AATTTCtTCT CCAAGCACAG AGGANCTTTC CCAGGATCAG 60 

GGGGATCGCG CGTCACTTGA TGCTGCTGAC AGTQGTCGTG GGAGCTGGAC GTCATGCTCA 120 

AGIGGCTCCC ATCATAATAT ACAGACGATC CAGCACCAGA GAAGCTGGGA a-.CTCTTCCA 180 

45 TTCGGGCATA CTCACTTTGA TTATPCAGQG GATCCTGCAG GTTTATQGGC ATCAAGCAGC 240 

CATATOGACC AAATTATGTT TTCTGATCAT AGCACAAAGT ATAACAGGCA AAATCAAAGT 300 

AGAGAGAGCC TTCAACAAGC CCAGTCCCGA GCAAGCTGGG CGTCTTCCAC AGGTTACTQG 360 

QGAGAAGACT CAGAAGGTGA CACAGGCACA ATAAAGOGGA GGGGTGGAAA GGATGTTTCC 420 

ATTCAAGCCG AAAGCAGTAG CCTAACGTCT GTGACTACGG AAGAAACCAA GCCTGTCCCC 480 

55 ATGCCTGCCC ACATAGCTCTr GGCATCAAGT ACTACAAAGG GGCTCATTGC ACGAAAGGAG 540 

GGCAGGTATC GAGAGCCCCC GCCCACCCCT CCCGGCTACA TTGGAATTCC CATrACTGAC 600 

TtTCCAGAAG GGCACTCCCA TCCAGCCAGG AAACCGCCGG ACTACAACGT GGCCCTTCAG 660 

AGATCGCGGA TGGTCGCACG ATCCTCCGAC ACAGCTGGGC CTTCATCCX3T ACAQCAGCCA 720 

CATGGGCATC CCACCAGCAG CAGGCCTGTG AACAAACCTC AGTGGCATAA AYCGAACGAG 780 

TCTGACCCGC GCCTCGCXrCC YTATCAGTCC CAAGGGTTTT CCAOCGAGGA GGATGAAGAT 840 

GAACAAGTTT CTGCTGTTTG AGGCACAGAC TTTTCTGGAA GCAGAGCX3AG CTACCTGAAA 900 

GGAGAGCACA AGAAGACGTC CTGAGCATTG GAGCCTTOGA ACTCACATTC TGAGGACGGT 960 

GGACCAGTTT GCCTCCTTCC CTGCCTTAAA AGCAGCATC3G GGSTTCTTCT CXrCCTTCTTC 1020 

crrrccxcTT tgcatgtgaa atactgtgaa GAAATTGCXX: TOGCAdTTT CAGACTTTCT 1080 

TGCTTGAAAT GCACAGTGCA GCAATCTTCG AGCTCCCACT GrrTGCTGCCT GCCACATCAC 1140 

ACAGTATCAT TCCAAATTCC AAGATCATCA CAACAAGATG ATTCACTCTG GCTGCACTTC 1200 

TCAATGCXTG GAAGGATTTT TTTTAATCTT CCTTTTAGAT TTCAATCCAG TCCTAGCACT 1260 

TGATCTCATT GGGATAATGA GAAAAGCTAG CCATTGAACT ACTTGGGGCC TTtAACCCAC 1320 

CAAGGAAGAC AAAGAAAAAC AATGAAATCC TTPGAGTACA GrTGCTTGTCC ACTTGTTTAC 1380 

AATGnCCTCC TTTTAAAAAA AAAAAAATGA GTTTAAAGAT TTTGTTCAGA GAGTAAATAT 1440 

ATATCCATTT AATGATTACA GTATTATTTT AAACCTTAAG TAGGGTTGCC AGCCTGGrrTT 1500 

CTGAAAAACC AAATATGCCG GACAGGGTPGT GGCCACACCA AGAAGACGQG AAGACCTGGC 1560 

TTGTGACCCT GGCTTCCCAT GTCCTTCTGG TCTCAOXGC GAAGTGCCCT ATCCTGGAAG 1620 

TATGAAATGT TAGCCAATTA ATACCAAGAC ACCTCATCTG CTCCITCCCC AGTGGATGGG 1680 

GTTCTTCTGT AAAACTGTTT GCACATQGCC AGGGGAGGGA ACTAGGACCC TTGTGTCCTG 1740 

TCTGAGOCTT ATGGAGGCAG GACQpTGTCA rrCGCGGATG TGTCXrrGCTC CATTGAGATG 1800 

GATGGCAAAC CCCArTTTTA 7W3TTATATTT CTTTGATTTT TGrTTAATTTA GAGGTCTAGG 1860 

' m TGTT rn ' wmTriTC ttttttttta AGAGAAACAT TTATAACTGG ATAGCATTGC 1920 

AGTGAAAGCA GCTTGGGATG TTGGAGCTAA TGCCAGCTGT TTATACTGCT CTTTCAAGAC 1980 

AGCcrccxnT tattgaattg gcattaggga ATAAACAAGC CTTTAAACGT GATAAAAGAT 2040 

CAAAAACCTG GTTAGACATG CCAGCCTTTG CAAGGCAGGT TAGTCACCAA AGACTAACCT 2100 

CCAAGTGGCT TEATGGACXX: TGCATATAGA GAAGGCCTAA GTCTAGCAAC CATCTGCTCA 2160 

CAGCTGCTAT TAACCCTATA ATGACTCAAA TGACCCCTCC ACTCTATTTT TGTGTTGTTT 2220 

TGCACAGACT CCGGAAAAGT GAAGGCTGCC AATCTGAGTA GTACTCAAAT GTGftGGAACT 2280 

GCTGGTCTTG GATTTTTTTT CCATTAAATT CAGCTGATCA TATTGATCAG TAGATAAAOG 2340 

TAAATAGCTT CAAATTTTAA AAGTGGAATT GCAGTOTTTT TTCACTGTAT CAAACAATGT 2400 

CAGTGCTTTA TTTAATAATT CTCTTCTGTA TCATGGCATT TGTCTACrrG CTTATTACAT 2460 



TOTCAATTAT GCATTTCTAA TTTTACATGT AATATGCATT ATTTGCCAGT TTTATTATAT 2520 

AGGCTATGGA CCTCATGTGC ATATAGAAAG ACAGAAATCT AGCTCTACCA CAAGTTGCAC 2580 

AAATOTTATC TAAGCATTAA GTAATTGTAG AACATAGGAC TGCTAATCTC AGTTCGCTCT 2640 

GTGATCTCAA GTGCAGAATG TACAATTAAC TGOTGATTTC CTCATACTTT TGATACTACT 2700 
TGTACCTCTA TGTCTTTTAG AAAGACATTG GTQGAGTCTG TATCCCTTTT GTATTTTTAA 2760 

TACAATAATT GTACATATTG GTTATATrTT TGTTGAAGAT GGTAGAAATG TACTATGriT 2820 

ATGCTTCTAC ATCCAGrTTTG TACAAGCTC3G AAAATAAATA AATATAACAT AAAAAAAAAA 2880 

AAAAAAAAAA 2890 



(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2399 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

30 GAAcrmcc ATCTGGCAAA CCGGAAACTC CATCCCCATT AAACCAACTC CCCCTTTTOG 60 

TTTCCCCCCC AGNGGAATAG AATTTGGACN CCXATATAAA TCCAGGAAAC CACCTAAATT 120 

CTTTAGTNGT TTGTGTTTGC AAGATCTAAG GTCATGCTTAA ACATTAAGTT CTTAAAATTT 180 

TTGGGAGGGA CCAGTGCACC TCTCCCTCTG AATTGTTCNC CAATTTAAAA TTGGAGTAAG 240 

OrrTTAAAAT GTCTNAITCC ATTGGAAGGG TNTGTTATTT CATTTTGAGC CCAGAGGGGA 300 

40 GAGGCACATT TTAAATATCA GAATTAGATT AGCTTTGAGT TTGrEACAATT GGGAACATAA 360 

TAGATTTTCA TAAATTATGT C?ltAX:r itJlT GGAAGTGTCA ACTCTCTTTA TGTCTGCTTG 420 

TAAAAGTTTC AAAATATGTT TTCCCTCAAA AAGGCAACGT TACTTCATTT GCTTGAATAT 480 

TATGATAGGA ATGCTTACTG ATATTACTTG ATAGTCATAT ATAGCCTAGG AAATTTAACA 540 

TATATATAAC TATAGCAGTA TTAATAATGA TAGTTGTACT TCTTTAAAAC ATTAAATTTG 600 

50 AGGAAACTTT AATGCTGTCT CGrTGTACATT GCTTTACTAC AGTGAGGGGG AATATCCTTT 660 

AGATTCAGCC TCAATTTACT GGTTAGTAGT ATGTGAACTC TGGTATAAAA ACGTAAACTA 720 

GACAGTAGAG COGATGAATT AAAATTGTAA ATTGCTACAT TGGCATTTTC TACCTCCTTT 780 

TCTGTCAGAG TATTACTTTT TCCAGCATTT ATTCTTATTT GTGAGTAAAG AGGAAATGGG 840 

AACCTCAGGT TAAAATTGAC ATTTTTGTTT CATTGAGAAT TTAAGCAGTA GGTACAGGfiG 900 

60 AAGTGACTTG TCACATTAAT TTGGrTGCCTA AATCTGTAAC TACAAGITGT GATCGACATG 960 



TACAAAATGT CTAAGAAAGG TCATATGCTG AATATTTTAC TTTTCCTGTA TAGTCTGCAT 
GATTTGTTTC ATAAACCCAG CTTATTTCCT CCAAAAAGCA AAATGGTCCT GTAATTITTA 
AAGTAAAATA AACGTGCCAT TTTGTCTGCA ATCTATAATT TCAGGAAGTT ATTGRAAGTT 
CTGACTCAGG GCTTTrrAAC AGriTCAAGCA ATTGTCAGrC ATATTTTGGA AACTCCATCT 
GrrGTAATTCT CCAGTGCCTT GAAAGAATTA TTAACTTGGC AACACTATTA AAACTTTATA 
AAAGATGGTC TTTAGTGCAC GTGTATCATT ATATACACXTT TTTAAAGTCA TATTGCTTAG 
CTTGTTAATA ATCATTCTGC ATGTCTGCTG GGTTTGGGTA ATTCTTTAAA GGAAGmTC 
TAGATTTCJCA CTTGATGrTTT CJITTTTTAAA AACTGATTAT TTATGQCCGT GACACTGTTA 
CCAGAAAACT AATTCTAATT AAGTTATTAT GCAAAGTCAT CTATAACTAG CATCTGGGAA 
GAGGAGATSG AGGCCACAGT TTGCTATTTT AGTATGAAAG GAGGATCTGT TTGGGAAACA 
TAGATTCTCT TCCCCTCAAA TGAGQQGAAA AAAAAAGACC CTTTGTTCAA ATCGATTCTG 
TTGTAAAAAA TTATTTTTAA AGGAAATCAC AAATTGTATG TCATTCTTAA TGCTAGTCTT 
ATAGAATAAA TCCATAAAAT TGrTTTTEATG TTCAGTATGT TTATGTCATT CTAAATGCAG 
CAAATTCAAT GATAGCAGTT CAATTGACTC ATAGCAGrTGT TTTGTATTTT TTCTAATTCT 
TTAGCTTTCA ATATTGGATT AAAGTCTTGT TTGTGAATAT AGTTTCCGTA TGGCAAATGA 
TTTCTTGCTT ArTAGCTTTT ffTTAAAGAAT GCTTAGTAAG AQCTAAGCTT TTAAAACTAA 
TGCAAACATT TATCGrTTAAT AAAACCTATG GTGTAATATC ATATAATGCT TTTCTTTGAT 
CTTTGGAGAA TEATTCTTTT ATAGTAGTAT ACATGAATTT TGATTTTTAA AGCATTTAAA 
AACAAATCTC AATACATTAA AAAACCTGTT ATTGTTAAAA RGGAAATTAC CATGCCTTTA 
AGAAACAAGG ATGTACATCT TCAATTCAGC ATRAGTGTCC ACATCTAGAA GGCTCTCATT 

GCAGrrTGrrrr ACAGrrrAAGG tacctctatc taaagggcca aagaagcatt tcatayttta 
acacctcaca ttctttcagg attaagacat atcaaaatag tctgaatagg ataaatttgg 
ataggaagta acttaaccag tctoggaaga ttcaggcttt ttctatkaaa aagcttattc 
ctcttcacaa ctcnggtggt aggwtttcat ttttcaagag ggtagatatt ttaaagcca 

(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 802 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

CGTGCCTOTA GTAAGCTCAT CCCTGCCTTT GAGATOGTGA TGCGrTGCCAA QGACAATGTT 60 

TACCACCTGG ACTGCTTTGC ATGTCAGCTT TGTAATCAGA GATTNTGTGT TGGAGACAAA 120 

TTTTTCCTAA AGAATAACWT GAYCCTTTGC CARACGGACT ACGAGGAAC3G TTTAATGAAA 180 

GAAGGTTATG CACCCCMGGT TCGCTGATCT ATCAACATCA CCCCATTAAG AATACAAAGC 240 

ACTACATTCT TTTATCTTTT TTGCTCCACA TGTACATAAG AATIGACACA GGAACCTACT 300 

GAATAGCGTA GATATAGGAA GGCAGGATGG TTATATGGAA TAAAAGGCGG ACTGCATCTG 360 

15 TATGTAGTGA AATTGCCCCA GTTCAGAGTT GAATGTTTAT TATTAAAGAA AAAAGTAATG 420 

TACATATOGC TGGATTTTTT TGCTTGCTAT TCGTTTTTGT GTCACTTGGC ATXSftGATGTT 480 

TATTTTOGAC TATTGTATAT AATGTATTGT AATATTTGAA GCACAAATGT AATACAGTTT 540 

TATTGTGrrTA CCATTTGTGT TCCATTTGCT YCTTTGTATT GTTGCATTTA CTTACAATCAG 600 

TGTTTAAACT TACTGTATAT TTATGCTTTC TGTATTTACC AGCTATTTTA AATGAGCTGT 660 

25 AACTTTCTAG TAAAGAATTG AAAAGCAAAT CCTCACTAAA GGATACACAG GATAGGATAA 720 

AGCCAAGTCN CATCAACATT AAAAAATACT AAAANANAAA ACACAAAAAA AAAAAANCCC 780 
GQGGGGGGCC CGGAACCCAT TC 802 

(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 461 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

CTAGGAGCAC CGAGCAGCTT GGCTAAAAGT AAGGGTGTPCG TGCTGATGGC CCTGTGCGCA 60 

CTCACCCGCG CTCTGCMCTC TCTGAACCTG GCGCCCCCGA CCGTCGCCGC CCCTGCCCCG 120 

AGTCTGTTCC CCGCCGCCCA GATGATGAAC AATQGCCTCC TCCAACAGCC CTCTGCCTTG 180 

50 ATCTTCCrcC CCTGCCGCCC AGTTCTTACT TCTGTGC3CCC TTAATGCCAA CTTTGTGTCC 240 

TCGAAGAGTC GTACCAAGTA CACCATTACA CCAGTCAAGA TGAGGAAGTC TGGGGGCCGA 300 

GACCACACAG GTOGGAACAA GGACAGGGGG ATTTAAGCAG TCAAAAGGAA AAACATGTTA 360 

AGACCCTAGA CTTGrTATATT GACACACTTG TACCTTGTAA GGCAGAGGAA TCTAATTAAA 420 

AAGCACTTAT TTGGCWNAAA AAAAAAAAAA AAAAAAAAAA C 461 



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2388 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 
GCCCflCGCCT CCGAAAGCGG AGAACGCTGG TQGGCCTGTT GTGGAGTACG CTTTCGACTC 60 
AGAAGCATCG AQGCTATAGG ACGCflGCTGT TGCCATGACG GCCCAGGGGG GCTayTCGCT 120 
AACCGAGtSCC GQOGCTTCAA GTOGGCCATT GAGCTAAGCG GGCCTGGAGG AGGCAGCAGG 180 

GGTCGAAGTG ACCQGGGCAG TGGGCAGGGA GftCTCGCTCT ACCCAGTCGG TrACTTGGAC 240 

AAGCAAGrrCC CTGATACCAG CGTGCAAGAG ACAGACOSGA TCCTGGTQGA GAAGCGCTGC 300 

TGGGACATCG CCTTQGGTCC CCTCAAACAG ATTCCCATGA ATCTCTTCAT CA3CTACATC 360 

GCAGGCAATA CTATCTCCAT CTTCCCTACT ATGflTGGrTGT GTATGATGGC CTGGCGACCC 420 

ATTCAGGCAC TTATQGCCAT TTCAGCCACT TTCAAGMCT TAGAAAGTTC AAGCCAGAAG 480 

TTTCTTCAGG GTTTQGTCTA TCTCATTGGG AACCTGATQG GTTTGGCATT GGCTCTTTAC 540 

AAGTQCCAGT CCATGGGACT GPTACCTACA CATGCATOGG ATTGGTTAGC CTTCATTGAG 600 

CCCCCTGAGA GAATGGAGTT CAGTGGTGGA GGACTGCTTT TGTGAACATC AGAAAGCAGC 660 
GCCTQGTCCC TATGTATTTG GGTCTTATTT ACATCCTTCT TTAAGCCCAG TGGCTCCTCA 720 
GCATACTCTT AAACTAATCA CTTATGTTAA AAAGAAOCAA AAGACTCTTT TCTCCATGGT 780 
GGGGTGACAG GTCCTAGAAG GACAATGTGC ATATTAOGAC AAACACAAAG AAACTATACC 840 
ATAACCCAAG GCTGAAAATA ATGTAGAAAA CTTTATTTTT GTTTCCAGTA CAGAGCAAAA 900 



CAACAACAAA AAAACATAAC TATGTAAACA AGAGAATAAC TGCTQCTAAA TCAAGAACTC 960 

45 TTGCAGCATC TCCTTTGAAT AAATTAAATG GTTGAGAACA ATGCATAAAA AAAGITOCAC 1020 

AAGTTCCTTA TTTTCCTTAA TATTTCACTT CTATTTAATA CAAQCTGGGA CATAAAAATT 1080 

CTGTTGGGGA TACCTGGGGG AAGATGTCAG AAACTAATGC TGAATTCAGC TTATACA3X5A 1140 

TGAAAAGAAA AACCAGACAA AAGGAGCACA TAAATATGCA TACAGTGTAA CTCTTATTAT 1200 

TTTAATACCC ACGATAAGGG ATTTTTGITA GCATGTTTAG GGGGAAOGAG GATTGGTCGG 1260 

55 ATCCTTGGGG CCACAGGAAT CTGAGGCAAC QGAAGATATA TAGAGTGATC GTCXXCCIGC 1320 

CGAAQGAACC TQGCAYCTGT CAAGCAGATG CTGCAGTTCA AACTTCAGCT TTTAAGATAG 1380 

ATAGCTATTG AAGGCAGAGG GTCAGCAGGA GGATGTGTAT - TTCTAATCTA OXTCGTAAA 1440 
GTCATAGGTA AGACTCAAAA GCX3QGATCTT ATTCAAAAGG CAGGTATTTC CTTTGTTTTC 1500 

TGTCTTGAAA TAGCCCCTTC CXCTAAGGTG CATTCTCTCA AGTTTTCAGT ATTGCTTTAT 1560 

TTCCAGTGAT TAAAAGAGAT GAGAGACTTT GGAGACAGAC AACGTAAGCA ACACATACAC 1620 

ACATGAAATA CTCTAGACAG AGATGAATAT AAATCTGGCC TAATAACCAG TTTTCCATGT 1680 

AACAGTGATT TTGTGrTTTCG GGCTGAftSCA GTGGTTATAT TAAAAGCCAC TAATTCCCTT 1740 

ATCCCTTTAA AAGATTTTTA CAATTCTCCA ACCACAAACA GCACTTCTAA AACTAACTTT 1800 

ACTTTCTGCC CATAATTTGT TCTACATGGA AAAAAAAAAT ATTACTTTGG CCAGGGGTGT 1860 

GTGTAAATGT GGCAGAKTTC CTAGGCAGGC TGACCTTTAC AGTATGGGCC TTTAAGATAC 1920 

TGGATCCTGG TTGQGCAACA AGTGTCACGC CTGAAGTTrC TGAAAACAAA TTAGAAGACT 1980 

GTTGGCTrGG CTAATCTOGT AGTTCAGGGC CAAGrrrrcTG TAGTCAGAAT GAAGAATAAA 2040 

ATTGAAAGAA AAAGGGQGAA ATGCTTATAC TTGOCATTAA GTTGAATGCC TCAAGTCTTA 2100 

ACTATGGCTT TGTAGATGAG GCAAAAGATT TCTTAGTOGT AAAATTTCTT CAACAGGTCA 2160 

ATC3CCAATCT GTATGCCATT TTACTAAAGT AGGTAAGGAG AGTAGCCGCT CAGTAACTTT 2220 

GGCACTAAAG AAAGAGTGTG GCTCTAGAAC TTCCAATCCC ATTGCTAGAT GTGCCCTTTA 2280 

AAAGATGGTC CAGTGCTTTC AGGGAAGGAT GrrTTAGCCAG TTTTCCTAGT ATTTGrrTCCT 2340 

TAAGATTTTT TGACCTGTGC TTAATAAGAC GGACQCGrrGG GTCGACCC 2388 



(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 642 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

AAAACAGACC ATTTAAAAAC TCftGACAAGA TTATATTTAA TATATTAATT ACTAAAAAGG 60 

CACAAGATTA CACTCAACAT ATTAGCTACT AAAAAGGCAC TGCTAAGftCA TTCAAGCAAA 120 

50 TAGCTATTAC ACACTACTGC AGATTTTACA GGTTTCTAAT TCTAACATAT GTTTGAAAAA 180 

TCCGTCACTA TTCCAAAATA TATITAATAA TGGAATATCT GCATTAATAT ACCATCCATG 240 

TCTTTTTACC ATTTGCCTTA ATATTGAATA TACTGTTTAC CTCACACTAA AAAGAAAACC 300 

AGAAGCCTTA TTTCTCATTT TGGGAGTGGA AGCTTCCATT TTTGrrGTCAA AAATGAATCC 360 

TCATTCTTAT GGAAATCTCT GTTATTAAGA TATTTCAAGA TGAGACAACA CTGAAGflTCA 420 

60 AAlTCTGrrr AGTATCACTA TCTTCTCTCC TCGTITCTCT CTTACTCCTC ATCCrCCCAG 480 
AAOXTTACCAG TTTATGGTAG AAAGATGGGA ACCTTATTTG AATGTGTrTT TTTTTTTCCA 
TGATGTCCAA TTTTGTTGTG GGAAAGGATT TGGATAAAAT rrTTGTTTAA ATTTTGGTAG 
ATTrTTATCT ATACAAATTT AAATAAAATT ATGTTTTGTA AG 

(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1251 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 
GCCGCTOCCC CrcCACGGAG TTGCTGATCA TCTQGGCTGT GATCCACAAA CCCGGTTCTT 
TGTCCCTCCT AATATCAAAC AGTGGATTGC CTTQCTGCAG AGGGGAAACT GCACGTTTAA 
AGAGAAAATA TCACGGGCCG CTTTCCACAA TGCAGTTGCT GTACTCATCT ACAATAATAA 
ATCCAAAGAG GAGCCAGTTA CCATGACTCA TCCAGGCACT GAGCATATTA TTGCTGTCAT 
GATAACAGAA TTGAGGGGrTA AGGATATTTT GAGTTATCTG GAGAAAAACA TCTCTGTACA 
AATGACAATA GCTGTTGGAA CTCGAATGCC ACCGAAGAAC TTCAGCCGTG QCTCTCTAGT 
CTTCGTGTCA ATATCCTTTA TTGTTTTGAT GATTATTTCT TCAGCATGGC TCATATTCTA 
CrrCATTCAG AAGATCAGGT ACACAAATGC ACGCGACAGG AACCAGCGTC GTCTCGGAGA 
TGCAGCCAAG AAAGCCATCA GTAAATTGAC AACCAGGACA GTAAAGAAGG GTGACAAGGA 
AACTGACCCA GACTTTGATC ATTGTGCAGT CTGCATAGAG AGCTATAAQC AGAATGATGT 
CGTCOGAATT CTCCCCTGCA AGCATGTTTT CCACAAATCC TGCGTGGATC CCTGGCTTAG 
TGAACATTGT ACCTGTCCTA TGTGCAAACT TAATATATTG AAGGCCCTGG GAATTGTGCC 
GAATTTQCCA TGTACTGATA ACGTAGCATT CGATATGGAA AGGCTCACCA GAACCCAAGC 
TCTTAACCGA AGATCAGCCC TCGGCGACCT CGCCGGCGAC AACTCCCTTG GCCTTGAGCC 
ACTTCGAACT TCGGGGATCT CACCTCTTCC TCAGGATGGG GAGCICACTC CGAGAACAGG 
AGAAATCAAC ATTCCAGTAA CAAAAGAATG GTTTATTATT GCCAGTrTTG GCCTCCTCAG 
TCCCCTCACA CTCTGCTACA TGATCATCAG AGCCACAGCT AGCTTGAATG CTAATGAGGT 
AGAATGGTTT TGAAGAAGAA AAAACCTGCT TTCTGACTGA TTTTGCCTTG AAGGAAAAAA 
GAACCTATTT TTCTGCATCA TTTACCAATC ATGCCACACA AGCATTTATT TTTAGTACAT 
TTTATTTTTT CATAAAATTG CTAATGCCAA AGCTTTGTAT TAAAAGAAAT AAATAATAAA 
ATAAAAAAAA AAAAACCCCX5 GGGGGGGCCC GGTCCCCAAT TGGCCCTATG G 1251 



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2127 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

CCGGCGGGAG AGGGAAGCTC CAGCGAGAGG OGCGGATCTC AGCGCGQGAG CAGTGCTTCr 60 

GCGGCAGGCC CCTGAGGGAG QGAGCTGTCA GCCAGGGAAA ACCGAGAACA CCATCACCAT 120 

20 GACAACCACyr CACCAGCCTC AGGACAGATA CAAAGCTGTC TQGCTTATCT TCTTCATGCT 180 

GGCTTCTOGGA ACGCTCCTCC CGTGGAATTT TTTCATGACG GCCACTCAGT ATTTCACAAA 240 

CCGCCTCGAC ATCTTCCCAGA ATGrrGTCCTT GGTCACTGCT GAACTGAGCA AGGACGCCCA 300 

GGCC?ICAGCG CNCCCTCCAG CACCCTTGCC TGAOCGGAAC TCTCTCAGTG CCATCTTCAA 360 

CAAICTCATC ACCCTATGTG CCATGCTGCC CCTGCTGTTA TTCACCTACC TCAACTCCTT 420 

30 CCTGCATCAG AGGftTCCCCC AGTCCGTACG GATCCTOGGC AGCCTGGTGG CCATCCTGCT 480 

GGrcrrrcTG atcactgcca tcctggtgaa qgtgcagctg gatgctctgc ccrrcmCT 540 



CATCACCATG ATCAAGATCG TGCTCATTAA TTCATTTGGT GCCATCCTGC AGGGCAGCCT 600 

GrrK3c?rcTG GCTGGCCTTC TGCCTGCCAG CTRACACGGC CCCCATCATG AGTGGCCAGG 660 

gcctagcagg crpcrrrccc TcayrcGccA tgatctgcgc tattgccagt ggctcggagc 720 

40 TATCAGAAAG TCCCrPC3GGC TACTTTATCA CAGCCTGTGC TGrrKATCATT TTGACCATCA 780 

TCTCTTACCT GGGCCTGCCC CGCCTQGAAT TCTACCGCTA CTACCAGCAG CrCAAGCTTG 840 

AAGGACCCGG GGAQCAGGAG ACCAAGTTGG ACCTCATTAG CAAAGGAGAG GAGCCAAGAG 900 

CAGGCAAAGA QGAATCIGGA GTITCAGTCT CCAACTCTCA GCCCACCAAT GAAAGCCACT 960 

CTATCAAAGC CATCCIGAAA AATATCTCAG TCCTGGCm CTCTCTCTGC TTCATCTTCA 1020 
CTATCACCAT TGGGATGTTT CCAGCCGTGA CTGTTGAGGT CAAGTCCAGC ATOGCAQGCA 1080 

GCAGCACdG GGAACGrrrAC TTCATTCCTG TGTCCTGTTT dTGACTTTC AATATCTTTG 1140 

ACTQGTTOGG CCQGAGCCTC ACAGCTGTAT TCATGTGGCC TGGGAAGGAC AGCCGCTGGC 1200 

TCCCAAGCTG GWroCTGGCC CGGCTGGTGT TTGTGCCACT GCTGCTGCTG TGCAACATTA 1260 

AGCCCCGCCG CTACCTCACT GTGGrrCTrCG AGCACGATGC CTGGnTCATC TTCTTCAIGG 1320 

60 CTCCOTPTGC CTTCrcCAAC GGCTACCTCG CCAGCCTCTG CATGTGCTTC GGGCCCAAGA 1380 
AAGTCAAGCC AGCTGAGGCA GAGACCGCAG AGCCATCATG GCCTTCTTCC TGTGTCTGGG 1440 

TCTCGCACTG GGGGCTGTTT TCTCCTTCCT GrTTCCGGGCA ATTGTOTGAC AAAQGATGGA 1500 

5 

CAGAAGGACT GCCTGCXTTCC CTCCCTGrCT QCCTCCTGCC CCTTCCTICT GCCAGGGGTG 1560 
ATCCTGAGTG GTCTGGCGC3T TTTTTCTTCT AACTGACTTC TGCTTTCCAC GGCXJTGTGCT 1620 

10 GQGCCCGGAT CTCCAGGCCC TGGGGAGGGA GCCTCTGGAC QGACAGTGGG GACATTOTGG 1680 

CTITCGGGCT CftGAGTCGAG GGACGC3GGTG TAGCCTCGGC ATTTGCTTGA GTTTCTCCAC 1740 

TCTTGGCrcT GACTGATCCC TGCTrGTGCA GC5CCAGTCGA GC3CTCTTGGG CTTGGAGAAC 1800 

15 

AOTKTKTICT CTOTCTATOT GTCTGTGrrCT CTGCGTCCGT GTCTGTCAGA CTCTCTGCCT 1860 

GTCCTCGGGT GGCTAGGAGC TGGGTCTGAC OSTTGTATGG TTTGACCTGA TATACTCCAT 1920 

20 TCrCCCCTGC GCCTCCTCCT CTGTGTTCTC TCCATOTCCC CCTOXAACT CCCCATGCCC 1980 

AGTTCTTACC CATCATQCAC CCTGTACAGT TGCXACGrTTA CTGCXTTTTTT TAAAAATATA 2040 

TTTGACflGAA ACCAGGTCCX: TTCflGAGGCT CTCTGATITA AATAAACCTT TCTTGTTTTT 2100 

TTCTCCATGG AAAAAAAAAA AAAAAAA 2127 



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1625 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

CAAAAGATCT ATAATCAGGA CATTGTTTAT GTAAGTTGGA CAANAAAAAT TCTICCCCTT 60 

TATGTCCACC CTTCCTATGA TPGCAAGACA AAATTTCCCT CCTTTACCTC ATCCCTATAA 120 

45 CATTCQGAGGC TGAGAAAAAT GAGGGGAGAT QGAACCAGAT ACAAGGAGAT CCAATAAGAG 180 

AAGCTTATTT AAATATTGTG AAATAAAGGA AGAMCCAAAG CATTTTTTTA AGTGGGGAAT 240 

CCTTTTCAAC AGTTATTATT TATCCATATT ATTAAYAACA TCTTTTCTGA CAAAATCCAT 300 

50 

CAGATCAAGT GTAAATGGAT AATCTTTTAA TGGATCTAAA CCTAGAAAGT TTCACTTACT 360 

GITCATGTCC GrTGTTCCAGA ATTCTGAAAT GGrTGrrGTGGT TTTGCrrTCC AAGTTCTTCT 420 
CTQCCTCCTC TTAATTCTCT AATTCCATGT CTTACAGAAG AATGAGAAAT TTCTTTCTTA 480 

TTATATTGAA GCCAACTGCC TTTAATTCTT GGGCCCTCTT ATATTTTTAA GGTGCAAAAT 540 

CTTGAGTATC ATCCTCTAAA AAACTTGGCT TCACrTCACAG AAACGCTQGC TCTCCTGTGC 600 

TTGAAGTCTC AGTCACCAGA CACAGGTTCT ATACAATTAA TGATGAGCTG GAGAAGTAAT 660 

ATGTAGCTAA TTTTTCAAAA GCATTGAATA TACTTTTCCGG AAAGAAAACA GAAATTAAAT 720 

5 ATTGCCACAT CTTGCCAGAA TCCCATCTGA CACCTTAACT TTGTCAQGTT TCCTACAACT 780 

TGCTAAICAA GTTTTATACA TTCTAAATCT CCCCAGTTTC TTTOGGGCTG GAAGATGCAA 840 

CTTCCATTTA ATAGAAACTT TGAAATCTTG GGGTAAGGGA GCAGTGQGGG GACTAGGGAG 900 

10 

AAGGATAAGA AATAGAATTA TTGAAA/VSCC CCCACCAGGG ACCTTCCTGG CCAGAATATG 960 

CAGAGTAATT CCTGCTGGCT TCACCTTTGA AAGTCCCTCG AAACTATGCA GATGAAACTG 1020 

15 AGnXTOTTTT TGATATTGTC AGATGTATTC TACCTTGGAA CJTCCCNACAC CTAAACTGGA 1080 

ATTCTTGTAT TTACATCTCC TCCACTGTCC CCCACACCAC CCCTCAATTC CTGCTGCXXC 1140 

TGCTAATGTT AAGCATTTTT CTCTTGTTAT CATCAGGrTTC ACATTAAAAM CAGRTACTTA 1200 

20 

CAAACTGACT TGAftGCACAG ATACmTAC GAATGTGATA AAATATTTTC TTAAGAAAftG 1260 

QAAAGAGGAT GTGGGTCAAA TAAAACACXG CATGGAICTT GATTGGTGAA TACTQGTOTA 1320 

25 AGAAAAGGGA GCTCAGGAAT TTTTATTACT GrTATTTGTAA A3X3AGTTTGA AGGAATTTGT 1380 

AAATGCXZACT QGTACATTTT TAAGGTGACA CATTTGCTCC TTATAAAGTT ATTAAAAATT 1440 

ACAGGGTAAG CTTAAATGAC GTTTGCCAGT AGTTTTACTT TATATAATCA ATATTGATAT 1500 

30 

TGTTGCTGAA CTATGTAACT TTATGATGCA TTTTTCAGTC CCTTTTCAGA GCAAATGCTT 1560 

TTGCAATGGT AGTAATGTTT AGTTTAAATT GACTTAATAA ATTMrTACCT GAGCAAAAAA 1620 

AAAAA 1625 

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1687 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

50 CGGGGTCACC AGTTATTAGA GGAAGTAACA CAAGGQGATA TGAGTCCAGC AGACACATTT 60 

CTGTCCGATC TGCCAAGGGA TGATATCTAT GTGTCAGATG TTGAGGACGA CGGTGATGAC 120 

ACATCTCTGG ATAGTGACCT GGATCCAGAG GAGCTGGCAG GAGTCAGOGG ACATCAQGGT 180 

55 

CTAAGGGACC AAAAQCGTAT GOGACTTACT GAAGTGCAAG ATGATAAAGA GGAQGAGGAG 240 

GAGGACSAATC CACTGCK3GT ACCACTGGAG GAAAAGGCAG TACTGCAGGA AGAACAAGCC 300 

60 AACCTGTC3GT TCTCAAAGGG CAGCTTTGCT GGGNATOGAG GACGATGCCG ATGAAGGCCC 360 
TGGAGATCAG TCAGGCCCAG CTGTTATTTG AGAACCGGYG GAAGGGACGG CAGCAGCAGC 
AGAAGCAGCA GCTGCCACAG ACACCCCCTT CCTGTTTGAA GACTGAGATA ATGTCTCCCC 
TGTACCAAGA TGAAGCCCCT AAGGNAACAG AGGCTTCTTC GGGGACAGAA GCTQCCACTG 
GCCTTGAAGG GGAAGAAAAG GATGGCATCT CAGACAGTGA TAGCAlGTACT AGCAKTGAGG 
AAGAAGAGAG CTGQGAACCC TOCGTGGTAA GAAGCGAASC GTGGGCCTAA AGTCAGATGA 
TGACGGGTTT GAGATrAGTGC CTATTGAGGA CCCAGCGAAA CATCGGATAC TGGACCCCGA 
AGGCCTTGCT CTAQGTGCTG TTATTGCXTTC TTCCAAAAflG GCCAAGAGAG ACCTCATAGA 
TAACTCCTTC AACXGGTACA CATTTAATGA GGATGAGGGG GAGCTTCCGG AGTGGTTTGT 
GCAAGAGGAA AAGCAGCACC GGATACGACA GrrTGCCTGTT GGTAAGAAGG A3GTQGAGCA 
TTACCGGAAA CGCTGGCGGG AAATCAATGC ACGTCCCATC AAGAAGGTQG CTGAGGCTAA 
GGCTAGAAAG AAAAGGAGGA TGCTGAAGAG GCTGGAGCAG ACCAGGAAGA AGGCAGAAGC 
CGTGGrTGAAC ACAGTQGACA TCTOCAGAAC GAGAGAAAGT GGCACAGCTG CGAAGTCTCT 
ACAAGAAGGC TGGGCTTGGC AAGGAGAAAC GCCATGTCAC CTACGTICTA GCCAAAAAAG 
GTGTGGGCCG CAAAGTQCGC CGGCCAGCTG GAGTCAGAGG TCATTTCAAG GTGGrTGGACT 
CAAGGATGAA GAAGGACCAA AGAGCACAGC AACGTAAGGA ACAAAAGAAA AAACACAAAC 
GGAAGTAAGC AGAGCTGCCA GGCTCCCAGG AGAQCATGGG GACTAGGAQG AAGGGTGTGG 
CATGGCTCAG TCTGGCCCCC TTGATTACCG GCCTAGCCCC TGCTCACATC ACAGCTCTCT 
GAAGAACAGT GAGGTCGAGT GCCTAGAACT CCCGTGGTrGG TCCTGAGCAG AGAGGAGGAT 
GTCCTCCTGC CTGCCTGAAG GTCTCCCATG AAAACACTGC TGAACTGTGT TGACACTCAT 
GACCCTTTTT TTAAACOGTT AAAGGGAAGT TCGGTGTTGG AGCGATACTC AATGrTAGTCA 
GTCTACACCT GGACGriCTGG GCCACTTAAG CXXTTCCCCAC CCCCATCCTA TTCCTRAATA 
AAACCAGGAT AA3X3GAARAA AAAAAAAAAA AAAAAAAAAG QGQGGGCCCN TAAAGGGWCC 

CANtrrrr 

(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 
GGATGACAGA TTGCGACANA GATTTGTGAC CCTTCCTCCT GAACTTCAGA GGGAGCTGAA 60 

ANCAGCGTAT GATCAAAGAC AAAGGCAGGG CGAGAACAGC ACTCACCAGC AGTCAGCCAG 120 

CGCATCTGTG CCCCGAGAAT CCTTTACTTC ATCTAAAGGC AGCAGTGAAA GAAAAGAAAA 180 

GAAACAAGAA GAAAAAAACC ATTQGTTCAC CAAAAAGGAT TCAGAGTCCT TTGAATAACA 240 
AGCTGCTTAA CAGTCCTGCA AAAACTCTGC CAGGGGCXTPG TGGCAGTCCC CAGAAGTTAA 300 

TTGATGQGTT TCTAAAACAT GAAGGACCTC CTGCAGAGAA ACCCCTGGAA GAACTCTCTG 360 

CTTCTACTTC AGGTOTGCCA GGCCTTTCTA GTTTCCAGTC TGACCCAGCT GGCTGTGTGA 420 

GACCTCCAGC ACCCAATCTA GCTGGAGCTG TTGAATTCAA TGATGTGAAG ACCTTGCTCA 480 

GAGAA3X3GAT AACTACAATT TCAGATCCAA TGGAAGAAGA CATTCTCCAA GTTGTGAAAT 540 

ACTGTACTGA TCTAATAGAA GAAAAAGATT TGGAAAAACT GGATCTAGTT ATAAAATACA 600 
TGAAAAGGCT GATGCAGCAA TCGGTGGAAT CGGTITOGAA TATQGCATTT GACTTTATTC 660 

TTGACAAOOT CCAGGTGGTT TTACAACAAA CTTATGGAAG CACATTAAAA GTTACATAAA 720 

TATTACCAGA GAGCCTGATG CTCTCTGATA GCTGrPGCCAT AAGTGCTTGT GAGGTATTTG 780 

CAAAGTGCAT GATAGTAATG CTCGGAGTTT TTATAATTTT AAATTTCTTT TAAAGCAAGT 840 

GTTTTGTACA TTTCTTTTCA AAAAGTGCCA AATTTGTCAG TATTGCATGT AAATAATTGT 900 

GTTAATTATT TTACTGTAGC ATAGATTCTA TTTACAAAAT GTTTGTTTAT AAAGTTTTAT 960 

GGATTTTTAC AGTGAAGTGT TTACAGTTGT TTAATAAAGA ACTGTATGTA TATTTC3GTAC 1020 

RGGCTCCTTT TKGTQAAYCC TTAAAAACTC AACTCTAGGA RQCAACTACT GTTTATTATA 1080 

CTAAARGGCT GAAAAMCCTC CAGGOCAGAC TGCTAAGCTC TGAAATYCCT GAGAGGTCTC 1140 

AGACCGGGAT TCTACTTGTT CCAAGAAAGG GTAAAGCTTC TAAACCATCT TATTCTTOTC 1200 

TCCAAGCATG AACACAGGAG CATGTYAAGA AAATCTTTAC TACTTTCTyC CATGCGGAGA 1260 

AATCTACATA TTTTGAATTA GAAACACCCT CACACCCACT TGAAGATTTT TTTCCTGGGA 1320 

ACATTATGTC CCGTAGATCA GAGGTGGTTCT TGTCTTTTTG CTTCTACTGG CCAITGAGAA 1380 

ACTTTGATGA TAAAAAAGAA OSGIATAGAT TTTTCAAACG TATATAAAAT ATTTTTATGT 1440 

TATATGTTAT GCCATAACTT TAAAATAAAA ATAGTTTAAA AITCTATGCT AGTGGATATT 1500 

TGGAACTTTT TCCTCAAACA AACACCCCAC ACTGACTTCA GCAAAACCCT AAAACTAGCT 1560 

ACAGATTACT ACTACGAATG AATCATYAAG TTTTGTGTCT GCAACAATTT AGAAGCACTA 1620 

AGCCCAAATA TCAGGAAATG TGTGTATGAT GGAATTTTCT AGGACAAAAC AGATCAAGAT 1680 

TAAAACAGGA TCAAGGATTA ATGGTATAAA AATQGTCTAC TAAAACAGGA TCAAGGATTA 1740 

AAACAGGATC AAGGATTAAT GGTATAAAAA TCTCTACTGG TTACCXX3GTG GCNGGGCCAT 1800 
ACAGQGTAGT GGTGGATGGA TAGTTTAGTT TGGWAAGGGT AA 1842 




(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 770 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

GGCACGAGCC CTATGCTGTT CTTGTGATAA TGAGTGAGTC TCACAAGATC TGGTGGTGTT 60 

ATAGGCATCT GGCATTTCCC CTGCTGACGC TCATTCTCTA TCCTGCCACC CTQGGAAGAA 120 

GTGTCTTCTG TCATGATrCT AAGTTTCCTG AGGCCTCCCC AGCTATGTAG AACTGTGAGC 180 

CAATTAAACC TCTTTTCTCT ATAAATTATC CAGTCTTATA TATTTCTTCA TAGCAGTGTG 240 

AGAACAGATA ATACCGTEAAA TTGGTATCAC AGAGAGTOGG GTGTTGCTAT AAACACATCT 300 


GAAAATGTTA AAGCAAATTT GGAACTGGGT AACAGGCAAA GGCTGGAACA GTTKGAAGAA 360 

CAGTTAAGAA GAAGACAGGA AAATATGAGA AATCTTGAAA CTTCCTAGAG TCTTAAAGGT 420 

CTCAGAAGAC ATGAAGATGT GGGAAGCTTT GGAACTTCCT AGAGACTTGT TTGAATGGCT 480 

TTGACCAAAA TGCTGATAGT GATATGGACA ATGAAGTCCA GGCTGAGCTT ATCCAGACAG 540 

ACATAAGAAG CTCGCTGQGA ACTTGAGTAA AGATCACTCT TGCTAGGCAA AGAGACTGGT 600 


GGCCmTTT CCTCTGCCCT AGAGATCTGT GGAAATCTGA ACCTGAGAGA GATGATTTAG 660 

GGTATCTGGC AGAAGAAATA TCTAAGCGGC AAAACCTTCM AGAGGAAGCA GAGCATAAAC 720 

GTTTGAAAAA TTTGCAGCCT GACNATQGGA GACCAAAGTT AAACCCAATT 770 



(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 519 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 
GAATTCGGCA CGAGCTGAGA GGCACAGGAG CAACAGCCAG TGCCCCCTGC AGAGGACCAC 60 



TGGGGTCACA GACTTCARAC CTGATGACCT GGGCTCAGAT CCCAGCTCTG CACCTACCAG 120 

CCGTGTGACA AGGTGTCCTC TCTGAGCCTC AGTCACACAC TGCCTTAACG GTTGGGCCTC 180 

ATGGAGCTGT TTGTGAAGGT TAAATGGGAA GACATAAAGC ACTTAGCCCA GAGCCAAGGA 240 

CATGCTGAAT AGGATAATGG TGGCCTCCTT TGGCGCTGTC CTGGTGCAGG TGTGCCGAGG 300 

AAYTGGGCAG GQGTGACAGA TACCTCTTCT AACCTAGTTC CTTTCCAAGA ACCTAATTOG 360 

TGTCTCTCCC TCCXXCAQGC AATTGGAAGG AGGAGGCTGG GCCCCAGCCC CAGAATACGG 420 

GAGGTTTCTC ACCX3TGGTAG GGAAATTGCT GGGTTGGGGG TGTGQGCAAC CACAGTCATC 480 


GTCTCrCTGC AGGACGGATG AGGCTTTGCT GACAGAGGC 519 






(2) INFORMATION FOR SEQ ID NO: 163: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 753 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 

GGCACGAGCG GCACGAGCAG CCAGrTTGCTG ACTGGCACAT GGCCTCCAGC GTCCCGQCTG 60 

GTGGGCACAC TAGAGCCGGA GGGATCTTCT TAATTGGTAA ATTGGATCTT GAAGCTTCAC 120 

TGTTTAAATC TrTTCAGTGG CTTCCCTTTG TACTTAGAAA AAAATGCAAC TTCTTCIGCT 180 

GGGACTCATC CGCTCACAGC CTTCCCCTCC ACCCTCTCTC TGCCTCATGC TCTGCCCCTG 240 

CCTGCCATGC CTCOGATACT CACCTTTTGT ACCCCAGCAC CCGTGCCCTC TGCCCCTCGA 300 


TCTTTGCCTG GCTGGTTGCT CCTCACTCAG TGTTCAGGAC AAATGCTCCT GGCCCTACCC 360 

CATCTAGCCA GTCTAGCCCG GTCTTCCCTG TCTTCCCTGT TTCATTCATG GCTCTTATTG 420 

TTTGTTWACT TGTGTGCTGT TGACTTTTAA CTCTCTCAGT CCCCACTGGA ATGCAAGCGA 480 

TCTCCCAAGC TCCTAGAATT GriTCCTGCCT CTTCACAGGC CCTTACGCTG TGTGTGCTOG 540 

TGCCGAATTC GGCACGAGGG TATGTGCACT TGCTQGTATG TATGTAGGTC TTTGCTAACA 600 


CATACGTGCA CACGCAGAAT GCTTCCAGGG GACTOCACAG CCTCTAGTTC GCAGCCCCCA 660 

CCCCTCCCTT TGSCCCTGCA CTCTCCCCTC TCTGAGCTGC ArTOGCATGA AAGQGTGCAN 720 

GGTTCCTGAN CCCGCNAGCG NCACCTOCTG GGA 753 



(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1400 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 
GGCACAGTTT ATTAATACCT ATTATGGa--A iiGTCACTTTG C?rTGGCATTG AAAATTACAT 
CATCTTTAAA GCAGTATTTG TCCCCAG.-.'TG GACTCATCAC TAGCAAAGAC TAGGTTCATT 
GGAAGGCATA GGGTCAGAGA ATOGGAJ^GAT C^IAGTGGAGG CGGGTTGITA AAGTGCTGTC 
AGTGAGTGAT TTTGTCTACT TGAATAATGG TCCATOTTTG GGGGCATATT GTGTTTCATA 
AGAAGTGAAA GGTATTTGCA AACTTAAGCTA CAAATGACCC ATAAATCTGT TAACAACAGT 
CCTTAATATG CAAAGATGAA AAACAiUXAT TACTGCTACC CAAAGGGAAC TGGTGCTTGG 
TGATGTGCAG ATGGGGCTGT TGGTTAAGAG AGCTATTACA GGTrTTCTCT CTTAGGTTTC 
ATAGGAGGTA GTTACTGAGA TGAGATTGTT TTATCTTTTT GAATACAGAT CTCTTGTCTT 
GAGTTAGTTC TGAGGATGGG AGTAATAAAG GAGTTTTTTG TTTTTTTGTT TGTTTGTTTG 
TTTTGGCTCC TTAGTAATAC TCCTCTGACA TTTATTTCTA TTATTCTTCA AAGAAAGGAA 
ACCAACTGAA ATGTTTGCTT TAACAAACAT TTTAATAAGT TCTCTGGGTT TTTTTTTGCC 
CTTTTAAAAA AATTAGCATA TACCATAGCA ATAAAAGAAC TAATGTTAAC TATTGTATGC 
TACAACTTAA GTGATTTTTC TAAAGAACXrA CAATGTCATT C31AAGTATTA TTGAAAAGGA 
TCATAGTCAC ATTGAATTTG TGAAGGCCAA AGAAATTGAA GGGAGTGATA TTTTCATTTT 
ATGATATTCA CATATTTAGT AAATTrrGTG TACAAGAATA CCAGGCAGAG TGTTTTACCC 
ATQGAAACAG GTTTCAGATT ACTTTGTTTr TACTGTTAGA GTCTCAAGTT TAGAAATGCT 
AACACTTAAA TCAGriTTTTT TCTCACTATA CTTGAAGATT GTTAATATTT TGATATCTTC 
CTAGCTTGAT GGAATTTAAA CATATCTTCA GATCTGTGAC AGTGACAGCC AATAGGACTG 
ATAATATTAG CTTCAAACCA AIAATATCCA GGGTTAAAAT AAAAATCATA GTGAAAGTAC 
GATTGTAAAA TTATGCTATA TTAACITTTA AGTCTGrTAAT AACTTGACAT CAAAATGTTA 
TGTAATTACC ATAAATAATG GCTiUGCGAGA ACATCTTTGG AAATTCTCAA ATTACCTTTC 
TTACTACACT GrTTGCAGAA TGAATGTAGA AATGATCCTG TTAGCTITCr GAATGTTCTG 
TGGTTGAATG 'Ixyr rm U C T TAAAIAAAGC TrTTGGTATT TGTTTAAATW ACAAAAAAAA 
AAAAAAAAAA AAAAACTCGA 

(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2153 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 


CAGGCCTCAG GGCCTCTGGT GGCTCTGGCC CAGACaCTAT TTGCAGTTCT TGTGCTATGG 60 

GTGGGAGTCT TCTTCCTCAA GTTTCGGCAG CTGTGCTGTC NCTGGATQGG CTGCTCCTCC 120 

CAGGGCTCAA GQGCTGTC3GT CCGCTCAGGG TCTCATTTCC CCAQGCCAAG TTCAAGQCAG 180 

CAGCCCTTTG TGAGGCGCTC TTGGCCCPGG GCTGGAGQGA GAACTTTAAG CTTTTTTGCT 240 

CACAGGGACG TQGTATQGGC CCTQGGTGCA GGTQCCXIACA TTCTGCTAAT GAGAGCTTTG 300 


TCTGATCAGT CCTQGGTCCA TCAGTTTGTC CATGTGTCCG GCTGCXAGCX: CGTCCCTTOG 360 

GATCCTTCCC CTQGGGTCTA GCCTTCTTCA TTAGTATATA CTCArrCCTT CATGCTTTCC 420 

TCAGCAGAAC ACTTCCACTT CTGAGGTGAG CTTTTGOCCC FTGCCXTTTCC TCCACAOGTC 480 

TTCCCTTTTr ATAAAGACCT GATAGCAGAA TAAATTGGTG TTTCCCTGTr GACCCAGCAC 540 

CATTTCTGTG GGCXTTAGAAT ATCSGCCCTCA ACCCTTAGAG TQGGGCAGTG AGGGCTTGAG 600 


GAGTGACCCT TCCTTTCTCA TQGTTTTAGT CATTTTQGCT GCCAGCCCTT AATGGCACAG 660 

ATCTGCTGCT TCTAACAGAT GGCCAGGAGG TGACACOGAT TTCAGCCATT GCCAAGGTTA 720 

GCACCCTCTC CTTTGAGCCT AGGGCCACAC TGTTCATTGT CACTTTAGGC AAGTTGCXrrGT 780 

TTGGCTTTAA AGGTAAQCCT GCCAGCTGTG AGAAGCCTTG GTAACTGATG GACTCATTTC 840 

CTGGTCCTTA AAGATGCAGC CTCTTAAGGG CTCCTTGATO GATGCCATCT CTCCTAGCCC 900 


CCAGCCCTGG TGCCACTGGT GGGCAGGTTC CCATTCTTTG GGGCTQQGAG GGACAGCTTG 960 

CCTGTTTCTG GTCACAAATT ACAGTCTTCT CTCCTGlftCC ATTCTGTGGC TTCAGCATGG 1020 

GGGCAGTAGC CTTTCATTAG TGTAGATAGT CATTCCCTGG TAGGGTGGAG GGTAAGACAT 1080 

AGGGTCPGGA ACrGTTIGGG ACXTTTTTGOG GATGTCCTGT GCCTCCCAGA TTCCTMGATT 1140 

CTGGGAGGAG AGGCTGCCX3C ATTCTGCTGC TCCTCACAGC GAGCAAAGCT GCACCCACTT 1200 


ACATTCAGTA TTTTCCTGGC ACTACAAAGA GTGGGAAGGC CTGGGATTTG CTGCTGCTCC 1260 

CTTAGAGCAG GGCCCCTYTT TTCAGCACTT TGGACACCTG GAGACCCAGC OCTGrrTATIT 1320 

AATGGrrAGTG GGCAAGTGTG TGTGCATACT GTCTGCCACT OCTTTCTCCC TGCXXCATOC 1380 

CAGAGAGCCrC TC3TCCCTGCC AGGCCCAGCX: TPCTTAGCCC CAACTTGGGA ACAAAGTGCA 1440 

ACATGGGATC ATGGGTTQGG GTGCTCAGGT CSAGCXXTPCTC TATAGTOCTT CCCTGQOCCA 1500 


AGCTGACACC AGCCCCTGAG GGTGGGGrTGG GACGGGTGGT GCTTAAAAGA GGAAGGGGAC 1560 

CAGTGTAGCA ACTTGCCAGG GACCCCACCX: CTCCCTCTCT GGGCCrGTOC AGTGAGCATG 1620 

GGGATTCCCA TCAAGGGGCC TGGCACCTGT GCTAGTTACG TAGCCGCTGN TCACX3CGCTC 1680 
ACTCCTGACC ACATGCACGT TCCCTAGATG CAGACTGCTT TGAACTTTAA AGCTGTACAA 
TTTGGTTATG TTTGTGCTGA CTTAAAATAT ATTTTAATGA GGAAAAAATA ATGGAGAACC 
CTGGGAAGGA CXTQGTTCTT TTGCTTCTCG GGGAACTGTA AGCCCTCGCG TTCTGGGAAT 
CGCTCTCTGC TGCTCTTTCC TGGAAGCTAA QCCTGTCTCC ACCGCCCGAG GCCTGCGCCG 
GrPGCTCCCGC CGCAGTTGCG TTTGCTTTGG ACCTTGCGTG CGGGGGAGGG GC5TGCTCQGT 
CCGAGCCCGC TCCTTTCTGT ACACCTAGCG CTGCCCGCCC CGCTTCTGTC TGAGGTCGTG 
TATGTCAAAA ATAAAGOCGC TAGAAACX3GA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
AAACTCGAGG GQGQGCCCGT ACCXAATTAA CXX^WTATGA TCTATAAAGC GTC 

(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1251 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 
GCCCACGCGT CCGCCCACGC GTCCGGCGGT GCGGAGTATG GGGCGCTGAT GGCCATGGAG 
GGCTACTGGC GCTTCCTQGC GCTGCTQGGG TCGQCACTGC TCGTCGGCTT CCTC3TCGGTG 
ATCTTCGCCC TCGTCTGGGT CCTCCACTAC CGAGAQGGGC TTGGCTGGGA TGGGAGCGCA 
CTAGAGTTTA ACTGGCACCC AGTGCTCATG GTCACCOGCT TCGTCTTCAT CCAGGGCATC 
GCCATCATCG TCTACAGACT GCCGTGGACC TGGAAATGCA GCAAGCTCCT GATGAAATCC 
ATCCAIQCAG GGTTAAA3X3C AGTTGCTG C C ATTCTTGCAA TTATCTCTGT GGTGGCOGTG 
TTTGAGAACC ACAATGTTAA CAATATAGCC AArATGTACA GTCTGCACAG CTGGGTTGGA 
CTGATAGCTG TCATATGCTA TTTOrTACAG CTTCTTTCAG GTTTTTCAGT CTTTCTGCTT 
CCATGGGCTC CGCTTrCTCT COGAGCATTT CTCATGOCCA TACATGTTTA TTCTQGAATT 
GTCATCTTTG GAACAGTGAT TGCAACAGCA CTTATQQGAT TGACAGAGAA ACTGATTTTT 
TCCCTC3AGAG ATCCTGCATA CAGTACATTC CCGOCAGAAG GrrGTTTPCGT AAATACQCTT 
GGCCTTCTGA TCCTOGTGTT CGGGGCCCTC ATTTTTTC3GA TAiSTCACCAG ACCGCAATGG 
AAACGTCCTA AGGAGCCAAA TTCTACCATT CTTCATCCAA ATGGAQQCAC TGAACAGGGA 
GCAAGAGGTT CCATGCCAGC CTACTCTGQC AACAACATGG ACAAATCAGA TTCAGAGTTA 
AACAGTGAAG TAGCAGCAAG GAAAAGAAAC TTAGCTCTGG ATGAGGCTGG QCAGAGATCT 



A(XATC3TAAA ATGTTGTAGA GATAGAGCCA TATAACGTCA CGTTTCAAAA CTAGCTCTAC 960 

AGTTTTGCTT CTCCTATTAG CCATATGATA ATTGGGCTAT GTAGTAOCAA TATTTACTTT 1020 

AATCACAAAG GATGGTTTCT TGAAATAATT TGTATTGATT GAGGCCTATG AACTGACCTG 1080 

AATTGGAAAG GATGTGATTA ATATAAATAA TAGCAGATAT AAATTGTGGT TATCTITACCT 1140 

TTATCTTGTT GAGGACCACA ACATTAGCAC GGTGCCTrGT GCAKAATAGA TACTCAATAT 1200 

GTGAATATGT GTCTACTAGT AGTTAATTGG ATAAACTGGC AGCATCCCTG A 1251 




(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 882 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 

GACSMTCTAG AACTATGGTC CCCCGGGACT GCAGGAATTC GGCACAGCGG CTQCGGGCGC 60 

GAGGTGAGGG GCGOGAGGTT CCCAGCAGGA TGCCCCGGCT CTGCAGGAAG CTGAAGTGAG 120 

AGCXXXXSGAG AGGGCCCAGC CCGOCCQGGG CAGGATGACC AAGGCCCGGC TGTTCCGGCT 180 

GTOGCTOGTG CTGGGGTCGG TCTTCATGAT CCTGCTGATC ATCGTGTACT GGGACAGCGC 240 

AGGCGCCGCG CACTTCTACT TGCACAOGTC CTTCTCTAGG CCGCACACGG GGCCGCCGCT 300 


GCCCACGCCC GGGCCGGACA GGGACAGGGA GCTCACGGCC GAYTCCGATG TCGACGAKTT 360 

TCTGGACAAK TTTCTCAGTG CTGGCGTGAA GCAGAGTGAC YTTCCCAGAA AGGAGAOGGA 420 

GCAGCOGCCT GOGCCGGGGA GCATGGAGGA GAGCGTGAGA RGCTACGACT GGTCCCCGCG 480 

CGAMSCCCGG CGCACCCAGA CCAGGGCCGG CAGCARGCGG ANCGGAGGAR CGTGCTGCGG 540 

GGCTTCTGCG CCAAYTOCAG CCTGGCCTTC CCCACCAAGG AGCGCGCATT CRAOGACATC 600 


CCCAACTCGG AGCTGAGCCA CCTGATCGTG GAOGACOQGC ACGGGGCCAT CTACrOCTAC 660 

GTGCOCAAGG TGGCCTGCAC CAACTGGAAG CGCGTRATGA TCGTGCTGAG CGGAAGCTGT 720 

GCACOGOGTG CGCCEACOQC GACCCGYTGC GOTCCCGCGC GAGCACGTGC ACAACGCCAG 780 

CGCGCACTGA CTTCAACAAT TCTGGCGCCG CTACGQGAAG TCTCCCCCAC CTCATGAAGT 840 

CAAGCTCAAG AATACACCAA TTCTTTCTGC GCGACCCTTC TG 882 




(2) INFORMATION FOR SEQ ID NO: 168: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1208 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 
GQGAAACTCA AAAGGATGAT GGAATGGTTG ATGGAGCCAG AGCCTAGAAG TRAAGGGATA 
CAGAGTGAAG ATAGAGGTAT TTACGTATAT TTWaATATTA GCTTTGGAAT TACGTAGGGA 
TTCTTAAGAA AAGATCATGA CAGGACAGCC ACATTTQGTA AAATGTCAGG GCAGCCAGTG 
CATGGrrCCTC CTQGGGCTCC TCAGTTGACG GGTTTAAATC ATTTCCTGAT CCCCCTGCCC 
TGGTTTGAGG AATGCATACA GTACGTGAAA TGCCTGTGGT ATGAGTTGCA ATGGGCAATC 
AACCTQQGTA AATCCAAGAT TAATGATTAfi TTCTAAAGAT GCAGTTGAAG TTCTAGAGTG 
GGAATTTTCC GTCAAGCARC TCAGCACAGC TTTATGCCTG TTCCTCTAAT AACGATAGGT 
AACAAATAGC TGTGTKTWCA CAGCTAQGAR GATAACCAAA TCTAGAGTTC TTGARTCTCA 
TTTAATAAAT AAKTATTATG AGTACCAACT QCATATTTCA GGCACTGCAT TTGACTCTGT 
TAAATACTGA TYCCTTAKGA CMSCCACWTC AGAWAAOCTT AATCTGTCTG ATCAATAAAC 
AGCTTGACTT AGAC21GGTAA AATAGCTTGC CACAGGTWAC CCAATTAGTA GGTAACAGCG 
ACAGAATAAC AGTQCAGrTTA AAATCTTAGA CTGGAGACTA ATTGCATAAG TTTGAATTTC 
ACTTCrGCTA TGTAAATTTG GGTGAGTACC TTAATTYACC TGAGTCTOGG TCTTTATATC 
TGTAGAATGG AGCTAATGAT ATTACTTAAT TrGCTTTATG TGAGATTAAA TGTACTAATA 
TATGTAAATC ACTTACAACA GCAtTTGACA TATTTGACAT ACTTAATATA TTTGCTACTA 
ATACTATTAG CAACAGCATT CTGATTTTCC AAGrTTGAAAT TCAGTGnTT CTTTTTTACT 
TTGCCATAAT TTACAATGTT GTGCTCTGTA AACCATAAAT TTCCCTGAGG TGTTGTCAGG 
TEAAAAAAAA ATCACTATGG CCCCCARNMA CTTOGAAAAT AGAAATGAGA CCAGCTTCAT 
CTATATTCTT TACTGCAAAT AACTTAGAAT TGTAATAGGC TAATATGTAC TGGGACTTCC 
AATTTGGGAA TATGACAAAA ATAATACTAT TTAGCTAAAA CATATACAGA ACTTATTTTT 
CCTCTGAA 

(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1307 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 

GGCACGAGAG AAAAGAQGTT GAGAATGTTT TCTAGCAGGC AGAATGTGCA TACATCOTTT 60 

CATGARTGTC CTTrGGGrTGC TGTTTCTTrT AAATCCTCTG TGCACAGGC3C TCTGGCCTTT 120 

ARTAAACTGT TrTTCTGTCT TACGTCATGC TGACTGGGTG CTAGGGGCTG ATTACAAAGG 180 

GGAAGAGTTG AACAGACATC AGGGGCCGAT GAAACCAAAG GACTAGGAGT CAGGAGAACA 240 

AGTCflGGGAT TAGGAGACAG CGGTTTGGTT TATTGTTATC CAGCTQGAGG ACTCCTAGGG 300 

GCAGCAGCAG GAGGAATACC AGGGCCACGG AGGGGCAGGA CJTCTCACAGT GGAGQGCAGA 360 

CTCTAACAGA TGCCAGCTGA ACGCTCGCTG GCCCTGGATG TCATACGAGT TGGGGACCAG 420 

AAATCTGGGC TCAGAGAACC CGTCCAGGGA GATTTGAAGC CATGQGTTAT CTTCTAGAGrr 480 

TGATACTGAT AATATATTTT AATTTTTATT GATGTTTAAT ACCTTCTGAA ACAGGAGGGT 540 

AAGATCflGAT GGGAAGCCCY TCTGTTGAAG GATCTTOGGA ACCTTGGTGG T riTXTI ' m ' 600 

TTGGTrTriT TTTTTTTGAT CGAGCTGTGG ACATCCTTCT TAATTCGATT ^3TGAGGATTT 660 

GTTTAACTAA AAAGTTCCCA AACACAGAAA GGGCCTCCCC ACCTGCTTTG GGGAGCTGTC 720 

TGrrSCTGGGA GTGCCAOSCA TCCSATQGGA CCCATCACTG OCAGrTGTCTG TGCCTCCCAG 780 

AGGTCAGCCC TGTGTCTGCC CTGGCTCTGT CTCCTCTGTG ACAGGGCAGA GCATTTCTGG 840 

TCAGTTTCTC CATGGTGCCT CCCACCCCTT TGTAAAGTGG ATGGACATGA TGGAATTCAG 900 

TTGTCrCACC CTGATAGCCT GGGTGTIGAT ATTCACTTTA CCCGCACTCA GACACAGGCG 960 

ACCTTGAAGC AGTTCTCGGT GTGTAGAGTC CACGTGACAG TCCOCACAGC CTCCCCAGAT 1020 

AGCTGTOTGC CTGTGCGCTA CTGCPGTGCC ATTTTCCCAA CTTNGGCGTr TCACTAAATC 1080 

CAGCTGATCT CTCTCTCTGT GCACTCCTGA TCCATGTTGA ACAATACATG TAQGTTCrTT 1140 

TTCCACGCAA TGTAAGAACA TGATATACTG TACGTTGGAA AGCATTTACC TTATTTATAT 1200 

ACCTGAATGT TCCTACTACA CAAATAAACA TATATTAAAT WCTAAAAAAA AAAAAAAAAA 1260 

CTGGAQGGGG GGCCCGGTAC CCAAATCGCC GGATAGTGAT CGTAAAC 1307 






(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1624 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 
GGCACGAGGT COCCGCCGCG GCCGCCTGGA ATTGTGGGAG TTGTGTCTGC CACTCGGCTG 60 

CCGGAGGCGA AGGTCCCTGA CTATGGCTCC CCAGAGCCPG CCTTCATCTA C3GATGGCTCC 120 

TCTGQGCATG CTGCTTGGGC TGCTGATGGC CGCCTGCTTC ACCTTCTGCC TCAGTCATCA 180 

GAACCTGAAG GAGTTTGCCC TGACCAACCC AGAGAAGAGC AGCACCAAAG AAACRGAGAG 240 

AAAAGAAACC AAAGCCXSAGG AGGAGCTGGA TGCCGAAGTC CTGGAGC3TGT TCCACCOSAC 300 

GCATGACTOG CAGGCCCTTC AQCCAGGGCA GGCTGTCCCT GCAGGATCCC ACGrEACQGCT 360 

GAATCTTCAG ACTGGGGAAA GAGAGGCAAA ACTCCAATAT GAGGACAAGT TCOGAAATAA 420 

TTTGAAAGGC AAAAOOCTGG ATATCAACAC CAACACCTAC ACATCTCAGG ATCTCAAGAG 480 

TQCACTGGCA AAATTCAAGG AGGQGGCAGA GATGGAGAGT TCAAAGGAAG ACAAGGCAAG 540 

GCAGGCTGAG GTAAAGCGGC TCTTCCGCCC CATTGAGGAA CTGAAGAAAG ACTTTGATCA 600 


GCTGAATGTT GTCATTGAGA CTGACATGCA GATCATQGTA CGGCTGATCA ACAAGrTTCAA 660 

TAGrrrCCAGC TCCAGTTTGG AAGAGAAGAT TGCTGCGCTC TTTGATCTTG AAIATTATGT 720 

CCATCAGATG GACAATQCGC AGGACCPGCT TTCCTTTGC3T GGTCTTCAAG TC3GTGATCAA 780 

TGQGCTGAAC AGCACAGftGC CCXnXXSTGAA GGAGTATC3CT GCGTTTGTGC TGQGCX3CTGC 840 

CTTTTCCAGC AACCXXMGG TCCAGGTQGA GGCCATCGAA GQGQGAGCCC TGCAGAAGCT 900 


GCTOGTCATC CTOGCCACGG AGCAGCCXXTT CACTGCAAAG AAGAAGGTCC TGrTTPGCACT 960 

GTOCTCCCTG CTGCGCCACT TCCCCTATGC CCAQCGGCAG TTCCTGAAGC TCGGQGGGCT 1020 

GCAGGTCCTG AQGACCCTGG TGCAGGAGAA GGQCACXX3AG GTGCTCGCX33 TGCGCX3TOGT 1080 

CACACTQCTC TACGACCTGG TCACGGAGAA GATGnTCGCC GAGGAGGAGG CTGAGCTGAC 1140 

CCAGGAGATG TCCCCAGAGA AGCPGCAGCA GTATOGCCAG GTACACCTCC TGCXaGQCCT 1200 


GTGGGAACAG GGCTGGTGCG AGATCACGGC CCACCTCCTG GCGCTGOCXrG AGCATGAIGC 1260 

CCGTGAGAAG GTIGCTGCAGA CACTGGGCGT CCTCCTGACC ACCTGCOGGG ACCGCTACCG 1320 

TCAGGACCCC CAGCTCGGCA GGACACTGGC CAGCCTGCAG GCTGAGTACC AGGrTGCTGGC 1380 

CAGCCTGGAG CTGCAGGATG GTGAQGACGA GGGCTACTTC CAGGAGCTGC TGGGCTCTGT 1440 

CAACAGCTTG CTGAAGGAGC TGAGATGAGG CCCCACACCA GGACTGGACT GGGATGCCGC 1500 


TAGTGAGGCT GAGQGGTGCX: AGCGTGGGTG QGCTTCrCAG GCAGGAGGAC ATCTPGGCAG 1560 

TGCTGGCTTG GCCATTAAAT GGAAACCTGA AGGCX^VAAAA AAAAAAAAAA AAAAAAAAAA 1620 

AAAA 1624 



(2) INFORMATION FOR SEQ ID NO: 171: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2003 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 

GGCACGAGCC AGCTTGCACG AGGAATCGGT GAGGTCCTGT CCTGAGGCTG CTGTCCGGGG 60 

CCGGTGGCTG CCCTCAAGGT CCCTTCCCTA GCTGCTGCGG TTGCCATTGC TTCTTGCCTG 120 

TTCTQGCATC AGGCACCTGG ATTGAGTTGC ACAGCTTTGC TTTATCCGGG CTTGTGTGCA 180 

GGGCCCGGCT GGGCTCCCCA TCTQCACATC CTGAGGACAG AAAAAGCTGG GTCTTGCTGT 240 

GCCCTCCCAG GCTTAGrTGTT CCCTCCCTCA AAGACTGACA GCCATCGTTC TGCACGGGGC 300 

TTTCTGCATG TGACGCCAGC TAAGCATAGT AAGAAGTCCA GCCTAGGAAG GGAAGGATTT 360 

TGGAGOTAGG TGGCnTOGT GACACACTCA CTTCTTTCTC AQCCTCCAGG ACACTATGGC 420 

CTGTTrTAAG AGACATCTTA TTTTTCTAAA GGTGAATTCT CAGATGATAG GTGAACCTGA 480 


GTTGCAGATA TACCAACTTC TGCTTGTATT TCTTAAATGA CAAAGATTAC CTAGCTAAGA 540 

AACTTCCTAG GGAACTAGGG AACCTATGTG TTCCCTCAGT GTGGTTTCCT GAAGCCAGTG 600 

ATATGGGGGT TAGGATAGGA AGAACTTTCT CGCTAATGAT AAGGAGAATC TCTTGTTTCC 660 

TCCCACCTGT GTTGTAAAGA TAAACTGACG ATATACAGGC ACATTATGTA AACATACACA 720 

CGCAATGAAA CCGAAGCTTG GCGGCCTGGG CGTGGTCTTG CAAAATGCTT CCAAAGCCAC 780 


CTTAGCCTGT TCTATTCAGC GGCAACCCCA AAGCACCTGT TAAGACTCCT GACCCCCAAG 840 

TGGCATGCAG CCCCCATGCC CACCGGGACC TGGTCAGCAC AGATCTTGAT GACTTCCCTT 900 

TCTAGGGCAG ACTQGGAGGG TATOCAGGAA TCGQCCCCTG CCCCACGGGC GTTTTCATGC 960 

TGTACAGTGA CCTAAAGTTG GTAAGATGTC ATAATGGACC AGTCCATGTG ATTTCAGTAT 1020 

ATACAACTCC ACCAGAOCCC TCCAACCCAT ATAACAOCCC ACCCCTGTTC GCTTCCTGTA 1080 


TQGTGATATC AIATGTAACA TTTACTCCTG TTTCTGCTGA TTGrmnT AATGTTTTGG 1140 

TTTOTTTTTG ACATCAGCTG TAATCATTCC TGTGCTGrrGT TTTTTATTAC CCTTGGTAGG 1200 

TATTAGACTT GCACTTTTTT AAAAAAAGGT TTCTGCATCG TGGAAGCATT TGACCCAGAG 1260 

TGGAACGCGT GGCCTATGCA GGTGGATTCC TTCAGGTCTT TCCTTTGGTr CTTTGAGCAT 1320 

CTTTGCTTTC ATTCGTCTCC CGTCTITGGT TCTCCAGTTC AAATTATTGC AAAGTAAAQG 1380 


ATCTTTGAGT AGGTTCGGTC TGAAAGGTPGT GGCCTTTATA TTTGATCCAC ACAOGTTGGT 1440 

CTTTTAACCG TGCTGAGCAG AAAACAAAAC AGGTTAAGAA GAGCCQGGTG GCAGCTGACA 1500 

GAGGAAGCCG CTCAAATACC TTCACAATAA ATAGTGGCAA TATATATATA CTTTTAAGAAG 1560 
GCTCTCCATT TQGCATCGTT TAATTTATAT CTrTATGTTCT AAGCACAGCT CTCTTCTCCT 
ATTTTCATCC TGCAAGCAAC TCAAAATATT TAAAATAAAG TTTACATTCT AGTTATTTTC 
AAATCTTTGC TTGATAAGTA TTAAGAAATA TTGGACTrGC TGCXX3TAATT TAAAGCTCTG 
TTGATTTTGr TTCCGTTTGG ATmTGGGG GAGGGGAGCA CTGTGTTTAT GCTGGAATAT 
GAAGTCTGAG ACCTTCCGGT GCTGGGAACA CACAAGAGTT GTTGAAAGTT GACAAGCAGA 
CTGCGCATGT CTCTGATGCT TTGTATCATT CTTGAGCAAT CGCTCGCTTCC GTGGACAATA 
AACAGTATTA TCAAAGAGAA AAAAAAAAAA AAAAAACTCG NGGGGGGC3CC OGGTACCCAA 
TTOGCCCTAT AGTGAGCCNA TTC 



(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 786 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 
GQCACAGCQG CACGAGAAGA CTTTGGTG T T TAAGAGATTA ATGTGTTAGC CAGAACAACT 
CATTTCTCTA CCMGTGTGTA GTCCATTTAT CTTTAAAGAT TTTCTATTGG AATAATTTTG 
AAATTACTTT CTTAGmTTC TTCATTAAAA ACTAAGAAAA TGCTTTGrTTT ATTATGAATT 
GCTATTTCTC TTGATTATTA TTCTTGGAGA AAGTCTATCA GACGTAATTC TTCTGATTTG 
CTTCTAGGCT AGAQGAAAAT GTGAAAGATG ACAAATGAAA ATTTCAAAQG TTGTCAGTAG 
TATGACTTCT TTTATCGTTT GTCATTATCA CAAATATATC AACATAGGAC TTTTAAAAGA 
TATrrTGTAC ATATTGGGCC TTAGTAGGAT TTIXSCATGAA ' ITlTi ' mTr CTTTTATGCC 
CAGAGAGAAA GAGCAAAGAA ATAACCAAGG GTGATGTACT OGTATTGAAG GTTTACCAAA 
TAAGGACTGC TTTTATTATG AACTATAGTC TATATTCTAA CTAAATCAAT TTTTCTATTA 
TG TC r m TT GTTCCTGCAG GCAAGATCTC TGAACTTTAT GCAGAGGGTT CTTTTAAAAA 
AACAAAGTTG AATTTTTTTA TTTCTTGGAA TATTTTTTTT CATTGAITTC TCCCAAGTAG 
AGCAGATTCA AATCTCCTTT GTACCCTATG TCTTTTTTGT TTTGCTATTA GCTCAGTATT 
COGTTTCTAC ATTTTCCTTT CCTAGAACCA GTCAATAAAT GACAAAAAAA AAAAAAAAAA 
ACrCGA 
(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1758 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

GQGACGAGCC CTGCCCACCT CCTGCAGCCT CCTQCGCCCC GCCGAGCTGG CGGATGGAGC 60 
TOCGCACGGG GAGCGTGGGC AGCCAGGCGG TGGCGCGGAG GATGGATGGG GACAGCCGAG 120 
ATGGCGGCGG CGGCAAGGAC GCCACCGGGT OGGAGGACTA CGAGAACCTG CCGACTAGCG 180 
CCTCOGTGTC CACCCACATG ACAGCAGGAG CGATGGCOGG GATCCTGGAG CACTCGGTCA 240 
TGTACCCGGT GGACTCGGTG AAGACACGAA TGCAGAGTTT GAGTCCAGAT CCCAAAGCCC 300 
AGTACACAAG TATCTAOGGA GCCCTCAAGA AAATCATGCG GACCGAAGCT TCTGGAQGCC 360 
CTTGCGAGGC GTCAACGTCA TGATCATGGG TGCAGGGCCR GCCCATGCCA TGTATTTrGC 420 
CTGCTATGAA AACATGAAAA GGACTTTAAA TGACGTTTTC CACCACCAAG GAAACAGCCA 480 
CCTAGCCAAC QGTATTTTGA AAGOGTTTGT CTGGAGTTAG AAAGrTTCTCT TCTTCAACAC 540 
GTCCCTCCCC AGGGrrGTTCC TCCCTGTGAC CCAGCCGCCT CGACTTCGGC CCGCTTGCTC 600 
AOGAATAAAG AACTCAGAGT TGTGTGTGCA ATGCACACCC AGACACACGC ACGCACACAC 660 
ACGCGCGCGC ACACACATGC TTTTTTCTGT TCCCCTCCGC TTTCTGAAGC CTGGGGAGAA 720 
ATCAGTGACA GAGGTGTTTT GGTTTTATTG TTATGTGGGT TTTCnTTGT ATTTTTTTTG 780 
TTTGTTTTGT TTTTAAACAT TCAAAAGCAA TTAATGATCA GACATAGGAG AAACCCTGAA 840 
TAGAAACAAA ACTTTTGAAT GCTQGATTCA AAAAAAAAAA AAAGTTATCT GGACAGCTTC 900 
TTTGAGACTA TTTAAAAACT GGTACAACAG GTCTCTACAA CGCCAAGATC TAACTAAGCT 960 
TTAAAAQGTC AAGAAGTTTT ATGGCTGACA AAGGACTCGC GCAACGCAGA AGOCCTTTCC 1020 
CACCTTAAGC TTCOQGGGAT CTGGGAAITT TACCCCCATT CTCTTCTGTT TGTCTGAGTC 1080 
TCATCTCTCT GCAAGCAAGG GCTGAAATCA TTTTGnTTGG TTGTTTTGAG GGAGAGAGGC 1140 
GGQGTGGGGG GGTGCAAATC TGCCAGCAGC TCTTACGTAA GGCATGTTTT ATTGGGGAGG 1200 
GCTGAGCTTT TATTTTCTCC TCTCCAGTGG GGTTGGCTTT TATTGTTTCT TGTTTGGGrrT 1260 
TGGAATGGAA ATATGGATAG CAGCATAAAG TACTTTTATT TTGACAAAAT TCATTTTTTT 1320 
CAACAATQGA GACATAGATT TGACCCACAA TAACTTCTCC CCCTCTCTTT TTACTCTGCT 1380 
CAAAAAGCAT CTCTCCTCCC ATTACCCAAC CTTGGTCATA AGTGTGCCTG GCTGGTTEGC 1440 
AGATATTTGT TCTGCTrTGT AAAAATTGGC CATTAGTGCA TTTATTGAGA TGATCTCTAA 1500 
AGA£3CTATGC CCTGACCTAC CCCTGATTCT ATGACATTGG GGCCCTTCTT TTGCTGAAAC 1560 

TGCCTTACGT AATGGTTTTA CTCCTTGAAA GAGATTTGAC GGAATCCATT TTATGCCAAG 1620 


TGCTGCCCTG CACTGTTTCT GCAATATGTG GTGTATGCTG TGGTGATCTT GCTOGGAATG 1680 

ATTATAAGTG TCTGTGTGGT GGGQGAGrTGG GTATTACATG CATTGCTGAA GAGTCAAAAA 1740 

AAAAAAAAAA AAACTCGA 1758 



(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 888 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 

CTGTTAGAAT GCCCAGTTTA CCTGGATGGC AACCCAACAG TGCTCCTGCC CACCTGCCCC 60 

TCAATCCTCC TAGAATTCAG CCCCCAATTG CCCAGTTACC AATAAAAACT TGTACACCAG 120 

CCCCAGQGAC AGTCTCAAAT GCAAATCCAC AGAGTGASMC ACCACCTCQG GTAGAATTTG 180 


ATGACAACAA TCCCTTTAGT GAAASTTTTC AAGAACGGGA ACGTAAGGAA CGTTTACGAG 240 

AACAGCAAGA GAGACAACGG ATCCAACTCA TGCAGGAGGT AGATAGACAA AGAQCTTTGC 300 

AGCAGAGGAT GGAAATQGAG CAGCATGGTA TGGTGGGCTC TGAGATAAGT AGTAGTAGGA 360 

CATCTGTGTC CCAGATTCCC TTCTACAGTT CCGACTTACC TTGTGATTTT ATGCAACCTC 420 

TAGGACCCCT TCAGCAGrTCT CCACAACACC AACAGCAAAT GGGGCAGGTT TTACAGCAGC 480 


AGAATATACA ACAAGGATCA ATTAATTCAC CCTCCAOCCA AACTTTCATG CAGACTAATG 540 

AGCGAGGCAG GTAGGCCCTC CTTCATTTGT TCCTQATTCA CCATCAATCC CTGTTGGAAG 600 

CCCAAATTTT TCTTCTCTrGA AGCAGGGACA TGGAAATCTT TCTQGGACCA GCTTCCAGCA 660 

GTCCCCAGTG AGGCCTTCTT TTACACCTGC TTTACCAGCA GCACCTCCAG TAGCTAATAG 720 

CAGTCTCCCA TGTGGCCAAG ATTCTACTAT AACCCATGGA CACAGTTATC OQGGATCAAC 780 


CCAATCGCTC ATTCAGTTGT ATTCTGATAT AATCCCAGAG GAAAAAGGCajJ AAAAAAAARA 840 

AMAARAAARA ARAAAGGAGA TGATGATGCA GAATTCCACC AAGGCTCC 888 


(2) INFORMATION FOR SEQ ID NO: 175: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2379 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

GGCAGAGCTA GTGTQGACTC CATCCCCCTG GAGTGGGATC ADSNCTATGA CCTCAGTCQG 60 

GACCTGGAC5T CTGCAATC3TC CAGAGCTCTG CCCTCTGAGG ATGAAGAAGG TCAGGATGAC 120 

AAAGATTTCT ACCTCCGGGG AGCTGTTGSC TTATCAGGGG ACCACAjGTGC CCTAGAGTCA 180 

CAGATCCGAC AACTGGGCAA AGCCTGGATG ATAGCCGCTT TCAGATACAG CAAACCGAAA 240 

ATATCATTCG CAGCAAAACT CCCACGGGGC CGGAGCTAGA CACCAGCTAC AAAGGCTACA 300 

TGAAACTCCT GGGCGAATGC AGTAGCAGTA TAGACTCCX3T GAAGAGACTG GftGCACAAAC 360 

TGAAGGAGGA AGAGGAGAGC CTTCCTGGCT TTGrTTAACCT GCATAGTACC GAAACCCAAA 420 

CGGCTGGTGT GATTGACCGA TGGGAGCTTC TCCAGGCCCA GGCATTGAGC AAGGAGTTGA 480 

GGATGAAGCA GAACXTPCCAG AAGTGGCAGC AGTTTAACTC AGACTTGAAC AGCATCTGGG 540 

CCTGGCTGGG GGACACGGAG GAGGAGTTGG AACAGCTCCA GCGTCTGGAA CTCAGCACTG 600 

ACATCCAGAC CATCGAGCTC CAGATCAAAA AGCTCAAGGA GCTCCAGAAA GCTGTGGACC 660 

ACCGCAAAGC CATCATCCTC TOCATCAATC TCTGCAGCXX: TGAGTTCACC CAGGCTGACA 720 

GCAAGGAGAG CCGGQACCTG CAGGATCGCT TGTSGCAGAT GAATGGGCXX: TGGGACCGAG 780 

TC?rGCTCrCT GCTGGAGGAG TGGCGGGGCC TGCTGCAGGA TGCCCTGATG CAGTGCCAGG 840 

GTTTCCAaXSA AATGAGCX::AT GGrnTGCTTC TTATGCTGGA GAACATTGAC AGAAGGAAAA 900 

ATGAAATTGT CCCTATTGAT TCTAj^CCTEG ATGCAGAGAT ACTTCAGGAC CATCACAAAC 960 

AGCTTATGCA AATAAAGCAT GfiGCTGTTOG AATCCCAACT CAGAGTAGCC TCTTTGCAAG 1020 

ACATGTCTTG CCAACTACTG GTGAATGCTG AAGGAACAGA CrGTTTAGAA GCCAAAGAAA 1080 

AAGTCCATGT TAITQGAAAT CGGCTCAAAC TTCTCTTGAA GGAGGTCAGT CX3TCATATCA 1140 

AGGAACTGGA GAAGTTATTA GACGTGTCAA GTEAGTCftGCA GGATTTGTCT TCCTGGTCTT 1200 

CTGCTGAOXSA ACTGGACACC TCAGGGTCTG TGAGTPCCCAY ATCAGGAAGG AGCACCCXAA 1260 

ACAGACAGAA AACGCX:ACGA GGCAAGTGTA GTCTCTCACA GCXTQGACCC TCTGnCAGCA 1320 

GTCCACATAG CAGGTCCACA AAAGGTGGCT CCGATTCCTC CCTTTCTGAG CCARGGCCAG 1380 

GTCGGTCCGG COGOSGCTTC CTGTPCAGAG TCCTCCGAGC AGCTCTTCCC CTTCftGCTTC 1440 

TCCTOCTCCT CCTCATOGGG CTTQCCTGCC TTGTACCAAT GTCAGAGGAA GACTACAGCT 1500 

GTCCCCTCTC CAACAACTTT GCCCGGTCAT TCCACCCCAT GCTCAGATAC ACGAATGG(:x: 1560 

CTCCTCCACT CTGAACTAAG CAGATGCCAT CTGCAGAAGT GCTGGTAGCA TAAGGAOGAT 1620 



WO 98/54963 



PCT/US98/11422 






CGGGTCATAA GCAATCCCAA ACTACCAACA AGAGGACCTT GATCTTGGCG AAAGCCMTCG 1680 

GTGTGGCAGC TTTAGCCTCC TCCAGATCAC ATGTGTGCAA ATTATGGCTT CAGAGGTQGA 1740 


AGATAAACAG TGAC3GGGQGA ACAAACAGAC AACAAGAAGG TTTGGAAGAA ATCTGCOTTG 1800 

AGACTCTGAA CX:TTAGCACT AAGGAGATTG AGTAAGGACC TCCAAAGTTC CCCGGACTCA 1860 

TGAATTCTGG GCCCTTGGCC KIATTCTGTGC ACAGCCAAGG ACTTCAGTAG ACXATCTCGG 1920 

CAGCTTTCCC ATGGTGCTGC TCCAACCATC AGATAAATGA CCCTCCCAAG CACCATGTCA 1980 

GTGTCGTACA ATCTACCAAC CAACCAGTGC TGAAGAGATT TTAGAACXTTT GTAACATACA 2040 


ATTTTTAAGA GCTTATATGG CAGCTTCCTT TTTACCTTGT TTTCCTTTGG GGCATGATGT 2100 

TTTAACCTTT GCTTTAGAAG CACAAGCTGT AAATCTAAAA GGCACTTTTT TTTAGAGGTA 2160 

TAAAGAAAAA CTAGATGTAA TAAATAAfiAT CATQGAAGGC TTTATGTGAA AAAAGTTGAA 2220 

TGTTATAGTA AAAAAAAAAG ATATTTATGT ATGTACAGTT TGCTAAAGGC AAGTTTTGTT 2280 

TGTATTGATT TCTTTGCATT TATTATAGAT ATTATAAAAT AAAAAAAAAA AAAAAAAAAC 2340 


TCGAGGGGGG GCCCGGTACC CAATTCGCCC TATAGTGAG 2379 




(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1348 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 

GCGCCTPCAC GATGCCGGCG GTCAGTOGTC CAGGTCCCTT ATTCTGCCTT CTCCTCCTGC 60 

TCCTGGACCC CCACAGCCCT GAGACGGQGT GrPCCTCCTCT ACGCAGGTTT GAGTACAAGC 120 

TCAGCTTCAA AGGCCCAAGG CTGGCATTGC CTGGGGCTQG AATACCCTTC TGGAGCCATC 180 

ATQGAGGTGA GGGGCAGGGG TGGGGACCGC TATGCCCAGG GTCCCTCAAA GTGCTGGAGG 240 

GGCTGTRACT TGGTGGGGAG TQGGTCTGTC ACAGCCATCC TCTGTCCAGG GTGQGGCAAG 300 


GOCTGQGACA GTGCCAGGCA CCCCAGGACC CCTTCCAGGC TTGTCTOCTG CTCCACCGCC 360 

TCAACAOCCC CCACCCCTGC CCAAGCTGTT TCTCCTCTGC CTCTCTNNTT CCCTGCOCCA 420 

GGACTTCTCT CTTCTCCTCT GCCTCTCCTT GGACCCCTGC CCTTCCTCTA CCTCTGACCT 480 

GTGAACACAC AGACACATGC TCACACACTA AGTCCCARGC ACACMSAAAG GCAATGTGGA 540 

CCAGCACAAA CCTCCACTCT CCCGGCTCCA TCCCARCGGG CCTGTGGCTG GCCATQAAAA 600 

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AGCATCCTCA TCCCAGGTGA GTGGGCACCA GCCCTTCCCT 660 

AGCflGGCATG AGAGCATCTT AGCCCATAGG TTTGTATTCA 720 

TACAAAGAGT GTCTrCTTCTA CCAGATCTTG TTCAAAAAAG 780 

CACGATAGAG GGAGTGAGCA AGAACAATGA GGATTAGAGT 840 

AGCATGGCTT CCAAAACATA TGCTGTGAGG TCTGTCCACC 900 

TAATTCTGAG CCTCTTAGCA GGCAAAGCAA AGACAGAAAG 960 

GTCTATAAAA TGTGAGTTCT TGGCCGGGTG OGC?rGGCTCA 1020 

TGGGAGGCCA QGGCGGATGG GTCGCGAGGT CAGGAGGTTG 1080 

GGTGAAGCCC TGACTCTACT AGAAGTGCAA AGATTGGCTG 1140 

TQGTCCCAGC TTCTCGGGAG GCTGAGGCGG GAGAGTTCCT 1200 

TGCGGTGAGC TGAGATCCTG CCATTGCACT TCAGCCTGGG 1260 

CAAAAAAAAA AAAAAAAAAA ACTCGAGGGG GGCCXCTACC 1320 

TAAACAAT 1348 

30 (2) INFORMATICN FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1502 base pairs 

(B) TYPE: nucleic acid 
35 (C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

40 CTCAAAATAA ATAAATAAAT AAAAATTTGT ATTCCATTGA TITOGGTAGA CACCAGGAAT 60 

GTGCATTTCT AACAAGCTTT CCAGGCGATC CTATAGTEAAG TCATCTGTGG ACTACTTTAA 120 

GAAACTCTTC TATAGAGAAT GGAGTTGGAT TAATAATAGG TGATTTTTTA CACTGGACTG 180 

45 

ATTCACAAGA ACCTAAACAG TAGTCCATGA AGCTQCTCAT CTGTGGTAAC TATTTGGCCC 240 

CGTCTCACTC TQAAAGCAGC AGGAGATGTT GTTTACTTTG TTTCTATCCC CTTTGTCTGG 300 

50 AGATTAATTT TQGAATGAAA GTrPTTCTCT CTATGCCATT CCTGGTTCTT TTCCAAAGCC 360 

TCATACAAGA QGATTAGGTC ACAATGCATG CATTACCTTT TAAAAGAATG CGATATTGAT 420 

ACCGATGCTT ACTTTTTTTT TTTTTNACTA CTTGrrTTAT TCCTTCCAOI AAAGTATAGC 480 


CCGCCTTTCT ATAQCATAGT TCTCTTTAGG TGGAATGATT CCTATAAGAT TTCTCATTAT 540 

TAAATCATGC ATTTTrCAAG ATGGAATCAA TMTTTGATTT AATCTAAGCT GATATTCTCA 600 

60 TTTGTTAGAA GAACAACCTA CATGCTAGAG AGAGAGGAGG AAATATACCC ACGACCACAC 660 



CTGGQGGCTA CCTGGAGGGA 
GTATGTGTGT TGTGGGTGGA 
5 GGGACTTCCA AACCCAGACC 
GGTTTGTGAT GATGGAACTA 
GGAGCGTGAA ATAGTCTAGG 


TGAGAGTTGG GCCATGGATT 
CAGATCGGCT GTGGATTTCT 
15 CGCCTGTAAT CCCOGCGCTT 
GAAACCATCC TGGCCGGAAT 
GGTGTGGTGG CGTGOGCCTG 


TGGGCCTGGG AGGCCGAGGT 
CACAGAGCCA GACTCTGGCT 
25 CAATTCGCOG NATATGATCG 
AGCCAGTTAG TATCCAGTTG GTGCTGGACT CXJ^GCCAGGT GTCCTGCXTTC ATQGTAGrTA 720

AATGATATAT AGAAAAGGTA AATTTTTAAA GAAATATTTA TTAATATATT CCTATAAAAC 780 


ATTTTAAAGG TAACCACATA AAAATGGTTA ATTTTTCCAT TCCAAAGTAA ATGCTAAGCA 840 

TGTTTATTAA TGAAGCAGTA CTTCTGATrA CTATATGACA TTCTGAAGTT AATTAAACTC 900 

10 ATTGCACTAA ATGTGTCTTC CTTQGTATAG TGGAGGA1TT GAGGATTGGA ATATAGAGTA 960 

GAGTGCrrGC TTAAGCCTGG GAGCCCATCT TTATAGCTAT TTGATGTAAG AAAAGAGACA 1020 

TGGNCCATTT CTAAACTATA TAAGGTTGAGT OrGTCTATTC CCAGCAGATA TAAAQGAAAA 1080 


AGGAAACTTT TTTGATTCCC ACX:TrCCCAG CCTCACCTAG CCATCTTCCA GCCTCAAATA 1140 

TAGAGATCTT AGrTQCAAGGT CCTGGGCTCT AGGTGATCAT TTCATAAGTC CTTTACAGAT 1200 

20 AAAGAAAAAG TAGTGTITGT AaXTTTPGTTT TTAAGTAACC CCAAAACAAA TTTATATTGT 1260 

ATTCAGCAAA ATTGGAATTC AQGTGTTTAA TTTTAGAACA TGAftGTGCCT GCTGTTTTAA 1320 

GCATTGACTT GTATAAAAAG AATTGCATGT CTCCAGTAAG CITATGQGTT TTCTCATTTT 1380 


TAQGTATATG GCTTTTAATC ATGTAAAGTG AAACATTAGT TTTCTTGCAT TTTATTACAG 1440 

CJl ' l ' CiTm ' i T GCAATAAAGA TQCTGCTGAA ATTAATTGAA AAAAAAAAAA AAAAAAACTC 1500 

30 GA 1502 



(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1637 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

45 ATTTTCTAGC CCACAAGGAC TGAAGTTCAG ATCCAAAAGT TCAdTGCTA ATTATCTTCA 60 

CAAAAATGGA GAGACTTCTC TTAAGCCAGA AGATTTTGAT TTTACTGTAC TTTCTAAAAG 120 

GGGTATCAAG TCAAGATATA AAGACTGCAG CATGGCAGCC CTGACATCCC ATCTACAAAA 180 


CCAAAGTAAC AATTCAAACT QGAACCTCAG GACCCGAAGC AAGTGCAAAA AGGATGTGTT 240 

TATGCCGCCA ACTAGTAGTT CAGAGTTGCA GGAGAGCAGA GGACTCTCTA ACTTTACTTC 300 

55 CACTCATTTG CTTTTGAAAG AAGATGAGQG TGTTGATGAT GTTAflCTTCA GAAAGGTTAG 360 

AAAGCCCAAA GGAAAGGTTGA CTATTTTGAA AGGAATCCCA ATTAAGAAAA CTAAAAAAGG 420 

ATGTAGGAAG AGCTGTTCAG GnTTGTTCM AAGTGATAGC AAAAGAGAAT CTGTGTGTAA 480 

TAAAGCAGAT GCTGAAAGTG AACCTGTTGC ACAAAAAAGT CAGCTTGATA GAACTGTCTG 540

CATTTCTGAT GCTGGAGCAT GTQGTCAGAC CCTCAGTGTG ACCAGTGAAG AAAACAGCCT 600

TGTAAAAAAA AAAGAAAGAT CATTGAGTTC AGGATCAAAT rrTTGTTCTG AACAAAAAAC 660 

TTCTOGCA'TC ATAAACAAAT TTTGTTCAGC CAAAGACTCA GAACACAAOG AGAAGTATGA 720 

GGATACCTTT TTAGAATCTG AAGAAATCGG AACAAAAGTA GAAGTTGTGG AAAGGAAAGA 780 

ACATTTCCAT ACTGACATTT TAAAACCTrOG CTCTGAAATG GACAACAACT GCTCACCAAC 840 

CAGGAAAGAC TTCACTGAAG ATACXIATCCC ACGGAACACA GATAGAAAGA AGGAAAACAA 900 

GCXnCTATTT TTGCAGCAAA TATAACAAAG AAGCTCTTAG CCXTCCACGA CGTAAAGCXTT 960 

TTAAGAAATG GACACCTCCT CGGTCACCTT TTAATCTCGT TCAAGAAACA CTTTTTCATG 1020 

ATCCATGGAA GCTTCTCATC GCTACTATAT TTCTCAATCG GACCTCAGGC AAAATGGCAA 1080 

TACCTGriGCT TTGGAAGTTT CTGGSGAAGT ATCCTTCAGC TGAGGTAGCA AGAACCGCAG 1140 

ACTCGAGAGA TGTGrTCAGAA CTTCTTAAAC CTCTTGGTCT CTACGATCTT CGGGCAAAAA 1200 

CCATTCTCAA GTrCTCAGAT GAATACCTGA CAAAGCAGTG GAAGTATCCA ATTGAGCTTC 1260 

A'TOQGATTGG TGCACCCTGA AGACCACAAA TTAAATAAAT ATCATGACIG GCTTTGGGAA 1320 

AATCATGAAA AATTAAGTCT ATCTTAAACT CTGCAGCTTT CAAGCTCATC TGTTATGCAT 1380 

AGCTTTOCAC TTCAAAAAAG CTTAATTAAG TACAACCAAC CACCTTTCCA GCXATAGAGA 1440 

TTTTAATTAG CCCAACTAGA AGCCTAGTCT GTGTGCTTTC TTAATGTGTC TGCCAATGGT 1500 

GGATCTTTGC TACTCAATGT GnTTGAACAT GTTTTGAGAT TnTTTAAAA TAAATTATTA 1560 

TTTGACAACA ATCCAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1620 

AAAAAAAAAA AAAAAAA 1637
(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2911 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 
GC?rGGTTTTT GrTTCTOCAAT AGGCGGCTTA GAGGGAQGGG CTTTTTOGCC TATACCTACT 
GTAGCTTCTC CACGTATGGA COCTAAAGGC TACTGCTGCT ACTACGGGGC TAGACAGrTTA
CTGTCTCAGC TCTAGGATGT GOGTTCTTCC ACTAGAAGCT CTTCTGAGGG AGGTAATTAA 
AAAACAGTOG AATQGAAAAA CAGTGCTC3TA GTCATCCTGT AATATGCTCC TTGTCAACAA 
TOTATACATT CCTCCTAGGT GCCATATTCA ITGCTTTAAG CTCAAGTCGC ATCTTACTAG
TCAACTATTC TGCCAATGAA GAAAACAAGT ATGATTATCT TCCAACTACT GTGAATGTGT 


GCTCAGAACT GGTGAAGCTA GTrTTCTCTG TGCTTGTCTC ATTCTGTGTT ATAAAGAAAG 
ATCATCAAAG TAGAAATTTG AAATATGCTT CCTGGAAGGA ATTCTCTGAT TTCATGAACTT 
10 GCTCCATTCC TGCCTTTCTT TATTTCCTGG ATAACTTGAT TGTCTTCTAT GTCCTGTCCT 
ATCTTCAACC AGCCATGGCT GTTATCTTCT GAAATTTTAG CATTATAACA ACAGCTCTTC 
TATTCAGGAT AGTCCTGAAG ANGCGTCTAA ACTGGATCCA GrTGGGCTTCC CTCCTGACTT 


TATrrTTCTC TATrcTGGCC TTGACTGCCG GGACTAAAAC TTTACAGCAC AACTTGC5CAG 
GACOTCGATT TCATCACGAT GCCrrTTTCA GCCCTTCCAA TTCCTGCCTT CTTTTCAGAA 
20 ATGAGTOTCC CAGAAAAGAC AATTGTACAG CAAAGGAATG GACTTTTCCT GAAGCTAAAT 
GGAACACCAC AGCCAGAGTT TTCAGTCACA TCCGrrCrrTGG CATGGGCCAT GTTCTTATTA 
TACTTCCAGTC TTTTATTTCT TCAATGGCTA ATATCTATAA TGAAAAGATA CTGAAGGAAG 
GGAACCAGCT CACTGAARGC ATCTTCATAC AGAACAGCAA ACTCTATTTC TTTGGCATTC 
TOTTEAATCG GCTGACTCTG GGCCTTCAGA GGAGTAACCG TGATCAGATT AAGAACTGTG 
GATnTTTTA TGGCCACAGrT GOVTTTTCAG TAGCCCTTAT TTTTGTAACT GCATTCCAGG 
GCCTTTCAGT GGCTTTCATT CTGAAGTTCC TGGATAACAT GTTCCATGrTC TTGATGGCCC 
AGCTTACCAC TCTCATTATC ACAACAGTGT CTGTCCTOGT CTITGACTTC AGGCCCTCCC 
TGGAATTTTT CTTGGAAGCC CCATCAGTCC TTCTCTCTAT ATTTATTTAX AATGCCAGCA 
AOCCTCAACT TCCGGAATAC GCACCTAGGC AAGAAAGGAT CCGAGATCTA AGTGGCAATC 
TTTGGGAGCG TTCCAGTC3GG GATGGAGAAG AACTAGAAAG ACTTACXIAAA CCCAAGAGTG 
ATCAGnCAGA TGAAGASACT TTCTAACTGG TACCCACATA GTTTGCAGCT CTCTTGAACC 
TTATTTICAC ATTTTCAGTG TTTGTAATAT TTATCTTTTC ACTTTGATAA ACCAGAAATG 
TTTCTAAATC CTAATATTCT TTGCATATAT CTAGCEACTC CCTAAATGGT TCCATCCAAG 
GCTTAGACTA CCCAAAGGCT AAGAAAIXTCT AAAGAACTGA TACAQGAGEA ACAATATGAA 
50 GAATTCATTA ATATCTCAGT ACTTGATAAA TCAGAAA£3TT ATATGTGCAG ATTATTTTCC 
TTCGCXnTCA AGCTTCCAAA AAACTTGTAA TAATCATOTT AGCTATAGCT TGTATATACA 
CATAGAGATC AATTTGCCAA ATATTCACAA TCATGTAGTT CTAGTTTACA TGCCAAAGTC 
TiaXTTTTT AACATTATAA AAGCTAGGTT GTCTCTTGAA rrrTGAGGCC CTAGAGATAG 
TCATTTTCCA AGTAAAGAGC AACGGGACCC TTTCTAAAAA CGTTGGrrTGA AGGACCTAAA 
60 TACCTCGCCA TACCATAGAT TTGGGATGAT GTAGTCTGTG CTAAATATTT TGCTGAAGAA 
GCAGrrTTCTC AGACACAACA TCTCAGAATT
TTITTGTAAT AATCTTTTGA TGTTTTAAAC 


TGTATTTTAA GTCATTTAAA CAAGCCACGG 
AAAATCTTGA TGTCATTACT CCTGAATTAT 
10 TTTATTAGTr ACTAATTCAA GCTGTGACTA 
GCTTCAGAAT CATACCAGAT TGTCAGTGAA 
TTTCAAAAGG ATCACTTAGC AAACACATGT 


CTCTAAAAAT AGAAAGACCA GTAATATATA 
AAGTGCATGG TATTTTTCAT GGrTATTTTGC 
20 AGTCAGGTGA TAGATGATAT TAAAAATTAG 
AGCTGGGTGA TGATAGAAGA GTGGGCTTTA 
ATACTGTAAA TATGAGCTTT ATOGTGTCAT 


TTCTCCTAAG TTTCATQCAG ATGAATATAA 
ATATCCACAA TAATATGACT GGCAAGAATT 
30 AACCTAAAAA AAAAAAAAAA AAAAACTCGA 




TTAATTTTTA GAAATTCATG GGAAATTOGA 2100 

ATTOGTTCCC TAGTCACCAT AGTTACCACT 2160 

TOGGGCTTTT TTCTCCTCAG TTTGAGGAGA 2220 

TACATTTTGG AGAATAAGAG GGCATTTTAT 2280 

rrGTATATCT TTCCAAGACT TGAAATGCTG 2340 

GCTGATGCCT AGGAACTTTT AAAGGGATCC 2400 

TGACTTTTAA CTGATGTATG AATATTAATA 2460 

AGTCACTTTA CAGTGCTACT TCACACTTAA 2520 

ATGCAGCCAG TTAACTCTCG TAGATAGAGA 2580 

CAAACAAAAG TGACTTGCTC AGGGrTCATQC 2640 

ACTOGCAGGC CTGTATGmT ACAGACTACC 2700 

TCTCAGAAAC TTATACATTT CTGCTCTCCT 2760 

GGTAATATAC TATTATATAA TTCATTTCTG 2820 

GGTOGAAATT TGTAATTAAA ATAATTATTA 2880 



(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 519 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 

45 GOCACGAGCC CCAGGCCAGC CAGGGCCAGG CCTACTTTGG CCACCCTTAA ATTAGAATGT 60 

GGGCTTCAGGG GTCACAGAAA AGCCATTTCT CTGACCTAGT GTTTGGCGTC CGGGAACTCT 120 

GTGCCCAACC TTCAGACCCT GGCAGTCCTC ACTGAGGCCA TTGGCCCAGA GCCCGCCATC 180 

CCCCGARACC CCCGGGAGCC GCCTGTTGCC ACCnTCCACAC CTGCCACACC CTCTGCCGGG 240 

CCCCAGCCCC TCCCAACCGG GACCGTGCTG GTCCCTGGGG GTCCTGCCCC ACCTTGCCTT 300 

55 GQGGAGGCAT GGGCCCTCCT CCTCCCACCC TGCCGGCCGT CACTCACCTC TTGCTTCrGG 360 

TCCCCCAGGC CTAGCCCTTG GAAGGAGACA GGAGTCTAGG GAGGCTGAAG CCCACTCCCG 420 

GGGAGGCCCG TGCTCCTCCA GCCCCAGQGA CAGCAAGGAA AAGAGAAGAG AGCAGAGCAT 480 

TTCATGGCTC TAATAAAAAA AAAAAAAAAA AAAACTCGA 519



(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 968 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 

TCCCCTTGGG GCCGGAAAAA GCGQGGTTGG CCTGNCCATT GGTTNTCCAT GCCGCCCGCC 60 

CATCCCCCAG TACTAGCCTG CAGTCCCAAT GTAGCCCCTC CCTCYTCCMA GAGCCCYTCM 120 

20 AACCGCCCCG STCANTTGTG ATTTCAQGAG GATTTGATGA AGATGTTAAA GCGAAAGTGG 180 

AGAACCrrrcT CGGGATTTCC AGCCTGGAAA AAAOGGACCC TGTTAGGCAA GCACCCTGCA 240 
(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1128 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 




300 



GCCCTCCCTG TCCCCTTCTT CCCCTCCCCT TCYCCCGCCC GTGGAGACAG CTGTTYTCAG 

CAGGGdCTC CGCAGGGAGG GGGCCGGCTC CTTCCCTGGC AGCAACATCC TTQCCCTTGT 360 

CACACAAGTC AGCCTCCATC TGCGCAGCTC TGTQGATGCG CTGCTGGAGG GCAACAGGTA 420 

30 TGPCACTCGC TGCmCAGCC CCTACCACCG CCAGCGGAAG CTCATCCACC CGGTCATGGT 480 

TCAGCACATC CAGCCCGCAG CGCTCAGCCT CCTGGCACAG TQGAGCACCC TCGTGCAGGA 540 
GCTGGAGGCT GCCCTCCAGC TQGCTTTCTA CCCGGATGCC GTGGAGGAGT GGCTGGAGGA 



600 



AAACCTTCCAC CCCAGCCTGC AGCGGCTGCA ARCTCTGCTG CAGGACCTCA GCGAGGTGTC 660 



720 



TGCCCCCCCG CTGCCACCCA CCAGCCCTQG CAGGGACGTT GCTCAGGACC CCTGAGGGGA 
40 GAGCTCATGC CAGGGGGCTC CTGCTGGAGG CTGGGGGGGC TCTGCWYTKY CWWWPGGCCT 780 
GGGCAATACG GCCCACGrTCG GCGTCGTGCC CTCTGGCCCA GCAGTGTCTr GCCCACACTC 840 
AGTTCCIGAG GQCCCTGGGC AQCCCCTGGG GGAGAGACTA GAAAACACAG AAGGAAGCAG 
CACAGGGAGA CCCGCrTTGT GATCTGCATG TGTGACACTG ATTCTTTGGA AATAAAGAGT 
GGAAGCTG 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:
TCTAAAACTT ATCACTAATC CTAATTCTTT TCCrGGCTTTT TCCTTTTGTC ACTTATTAAT 60 
CAGTTTTTCA AAGGACGAAT GAATTTAGAG ATGTACTCTG GAGCAGTATC ATGTTAAACC 120 
AGGGCTTATAT TAGAAAAATC ATCCTCATAA TCATTCTGGG AAGTTTTrCC TCCCCAAAAA 180 
AAGCCATCCr GATGGGrrTTT CAAAACCAGA AAAAAGCTCT TAATGAGGAA CAGACCACTG 
GAOTACCCAT GAGCATCTCA GGAAAACTGA GACCCTCGAG AAGCCTTGAT TTCGTGCAAC
CCCCAAGGTT TCAGAGCCAG CAGCCCAGTG CTGTGGTTGA CAGACGTGGrT TTTEeTGGRGA 
15 AAGCAGCCAG AGGCCAGGAA TTTTCAGAGT CGTGAGTCAC GRTVTCCCAC CCAAGATTAG 420 
AGCAMAGATT AGCCATACTG AGATTTGGTA AAATCATICT GTCTAAGCAA TGGAGGTGTC 480 
TGCAMACGTC CAffTGCCTCT TCACAGGGGA TGCAGGCAGA TCSYGGGrTTT AGGATOGGOl 540 
AGGCCACCGC ACCCCCYTTC AYTGCTCTGC ACCTGCTCCC TCACGTGGAC ACrGTCCACA 
ACTGTGGCTC TCACAGGACA GTTGCCCAAG GAGCTCATAT CTTATTQGAG ATAGGGGGTC 
25 GTACAGCjrcA CAITCATCAG CAGTCrTGAGC OGGGTGACAT GGGQGTGTCA ACCCAGCATC 720 
TOTCCAGGAG CTCCTCCTGC AGCGGCTCTG GCAGGTGGCC TGAGGCTCCT TITTGAGAGA 780 

GAAcrcnrc gccttcctgt ctcctctcct ctgatctgtt ctttcttgga acaccaccca 

AGAACGTCAC CTCCTCCATC AGATTGTGAG CTCCTGGAGG GCAGGAGCTG TGTCCTTCTA 
TTCATCITCC TATCCCCAGA ACCTTGCACA GATCCTQGAA TGTGGTAGOr GCTCAGTAAA 
35 TOTGKTrrGA ATAAATCAAT GAATGAATGA ACAAATGAAT GAATTTOCTT ACTTCAAGGC 

AAAAGAACCA TCAAACTGTA TTTPGAGrTTT CTATGrTTATA GCAGTCAGCA AATCCTATTA 1080 
AATACTTTCT GTTTCCAAGC AAAAAAAAAA AAAAAAAAAA AAACTCGA 1128 

(2) INFORMATION FOR SEQ ID NO: 183:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2276 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 
CCGCGGCCTC TGACCTCATG GOGTAGAGCC TAGCAACAGC QCAGGCTCCC AGCCGAGTCC 60 
CTTATGGCCG CTGCCGTCCC GAAGAGGATG AGGGGGCCAG CACAAGCGAA ACTGCTGCCC 120 
GGOTCGGCCA TCCAAGCCCT TGrTGGGGTTG GCGCQGCCGC TGGTCTTGGC GCTCCTGCTT 



180 



60 CTtyrCCOCCG CTCTATCCAG TGnrGTATCA CGGACTGATT CACCGAGCCC AACCGTACTC 
AACTCACATA TTTCTACCCC AAATGTPGAAT GCTTTAACAC ATGAAAACCA AACCAAACCT
TCTATTTCCX: AAATCAGCAC OVCCCTCCCT CCCACGACGA GTACCAAGAA AAGTGGAGGA 


GCATCTCTGG TCCCTCATCC CTCGCCTACT CCTCTGTCTC AAGAGGAAGC TGATAACAAT 
GAAGATCCTA OTATAGAGGA GGAGGATCTT CTCATGCTGA ACAGTTCTCC ATCCACAGCC 
10 AAAGACACTC TAGACAATGG CGATTATGGA GAACCAGACT ATGACTGGAC CACGGGCCCC 
AGGGACGACG ACGACTTCTGA TCACACCTTG GAAGAAAACA GGGGTTACAT C3GAAATTGAA 
CACTICAGrGA AATCTTTTAA GATC3CCATCC TCAAATATAG AAGAGGAAGA CAGCCATTTC 
TTTTTrCATC TTATTATITr TOrTTTTTGC ATTGCTCTPG TTTACATTAC ATATCACAAC 
AAAAGGAAGA TTTTrcrrCT GGTTCAAAGC AGGAAATGGC GTGATGCXXT TTGTTCCAAA 
ACACJTOGAAT ACCATpGCCT AGATCAGAAT GTTAATGAGG CAATGCCTTC TTTGAAGATT 
ACCAATCATT ATATTTTTTA AAGCACTGTG ATTTGAATTT GCTTATGTAA TTTTATTTGC 
TTCACrmT ATATGATATT GTOCAAATGT TTOXATAGG CAAOTGCSTAC TTAAATGAGA 
GC?rGAG?rCTC TCTTTTOCCT TQC3TGCTTTG GAAATTAAAT GTCACAAAOG AGTATATAAT 
TTTTTATCTC TACTTTTAGA GCTGAGrTTTA A3CAGGTGTC CAAAATGTGA CTTTAAACATT 
ACCTTATATT TACACTGTTA GTTrTTATTG TTTTAGATTT ATTATGCTTC TTCTOGAAGT 
ATTAGTCATG CTACTTTTAA AAGATCCCAA ACTTGTAACT AAATTCTGAC ATATCTGTTA 
CTCCIGACTC ACATTCATTC TCCX5CCATTC AAATACTATT TTTTATCCAC ATTTTTTTTT 
GITCCCAAAC TCTAATGTAC AAGGATATGT GTGATAATGC TTTGGATrTG AGTAATATTT 

rirrrrcTTC caagaaaact gctttggata tttttagata atttaaacat aatttaggat 

40 AATCATATTG CTCAATCTGA CCACAATTTT AGGTAAAACA TrAAATGTOT CAGAAATCTT 
GGCAACAGAG ACTCTCCAGC TTGCAGTGGA CATAGATAAA ATCTTACAGA GATACTATTT 
TTTTCGTTGG AATTACTATA TTAAATTTAG AAGCAGAAAC TGGTAAAATG TTAAATACAT 


CTACAATTCC TTITAGTrAG CAATTGAaTC TAGCATOSGT TCCTCCAAGG TTTCAAGCAA 
TGGGCAGACT TEAAAA3TAT ATCAGATTCG TTTACTTCGT TrArrAlTTr ACAGTAAATT 
50 TCAATAAATC TTAGGGGTCA TTATCACITA AATAATACTG TftCCTAGC?rC TTTCAAATTA 
AAATTATACC TCAATCAAGT TGTTTGTATA CATAAAOGAT ATTTOTGrTAC AATTACCTTT 
TTTCCCCCAC ACironTTC mtJiTri TG TTTTTTATGG CAACTGGAAA GTATTTACTA 


TCGGATICAT TTATGTCTGT CTTTCTATCA TAAAGAATTG ATCAATATGTT AAATA3CTGA 
TTIGAACCAT GGTITCACTTA CAAGTCTCAC TACAGCrTTT TAGAAAACAT AGCCCTAAITA 
60 TATOTTAAGC AQGACCCGGG TGAGCXIAGTG GGCTTGOGCT TTATCTAGAG CTQGAAGAAG 
GCCGTCCATC CTOrCTCTTG GGCGGACAGT GTACTTTCCT AATAGGGAAG C3GAAGCACAA 2100

TGGAAATACC CCTGAACCCT TTTATTGCAG TAATTITrTT CATATCIGAA ACTATTATTr 2160 

AATATTTTGA ATAAGAirTT AAAAAATAAA TGGCAAAGAT ATAAATCTAA AAAAAAAAAA 2220 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 

(2) INFORMATION FOR SEQ ID NO: 184: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2500 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 
TCCAAGCTAC GCCACTCGGG CTGGCX30GTT GGGAGCGC3GA CTTGCAGAGCG TGGTCC3TGGC 
25 CCCGCCCGrG AGAAGAGCGA GGCOCAQGAG GGGCTGCCAT GGCCGGGCflG CAGTTCCAGT 

ACGATCACAG TGC5GAACACC TTCTTCTACT TCCTCACCTC CTTCGnGGOG CTCATCGTGA 180 
TCCCGGCGAC ATACTACCTC TGGCCCCGAG ATCAGAATGC CGAGCAAAIT CGATTAAAGA 240 
ATATCAGAAA AGTATATCGA AGGTGTATGT GC3TACGTTTA aSGTTATTAA AACCCCAGCC 300 
AAATATTArr CCTACAOTAA AGAAAATACT TCTGCTTCCA GGATC3C3GCAT TGTTCTTATr 360 

35 ccrrccATAT aaaotttcca aaacagaccg agaataccaa gaatacaatc cttatgaagt 

ATTAAATTTG GATCCTCGAG CCAGAGTAGC AGAAATTAAA AAACAATATC GTrTC3CTGTC 
ACTTAAATAT CATCCAGATA AAGGAGGTGA TGAGCTTTATC TTCATGAGGA TAGCAAAAGC 540 
TEA3GCTOCT TTAACGGATG AAGAGTCCCG GAAAAATTGG GAAGAATTTG GAAATCCAGA 
TCC3GCCTCAA GCCACAAGCT TTCGAATTGC CCTGCCftGCT TGGATAGTTG ACCAGAAAAA 
45 TTCAKETCTG OTTTTACITC TATATC3GAIT GGCATTTATG GrrTATCCTTC CftGTTGTTGT 
GGGCTCTTOG TC3OTATCGCT CAATACGCTA TAGTCGAGAC CAGATTCTAA TACGSACAAC 
ACAGATTTAT ACATRClTre TTTATAAAAC CCGAAATATG GATATGAAAC GTCTTATCAT 840 
GGmrOGST GGAGCTTCTG AATTTGATCC TCAGTATAAT AAAGATGCCA CAAGCftGACC 900 
AACGGATAAT ATTCTAATAC CACAGCTAAT CAGftGAAATT GGCAGCATTA ATTTAAAGAA 960 
55 GAATCAGCCr CCACITACCT GCCCATATAG CCTGAAQGCC AGAGTTCnT TACTGTCTCA 1020 
TCTTGCTAGA ATCAAAATTC CTGAGACCCT TGAAGAAGAT CAGCAATTCA TGCTAAAAAA 
GICTCCTOCC CTACTTCAAG AAATGGTTAA TGTAATCTGC CAACTAATAG TAATGGCCCG 

GAACCaiGAA GAAAGGGAGT TTCGrTGCTCC AACTTTGGCA TCCCTAGAAA ACTGCATGAA
GCTTTCTCAG A'TOGCCCnTC AOGGACTTCA GCAATTTAAG TCTCCCCTTC TCCAGCTCCC 
TCATATTCAA GAGGACAATC TTAGACGGGT TTCTAATCAT AAGAAGTATA AAATTAAAAC 
TATCCAGGAT TrcGTGAGTT TAAAAGAATC AGATCGTCAC ACTCTACTGC ACTTCCTTGA 
AGATGAAAAA TATGAAGAGG TTATGGCTGT CCTTQGGAGT TTTCCATATG TGACCATGGA 
TATAAAATCA GAGGTGT-rAG ATGATGAAGA TAGCAACAAC ATCACAGTAG GATCCTTAGT 
TACAGICTTG GlTAACyrTGA CAAGGCAAAC AATGGCTGAA GTATTTGAAA AGGAGCAGTC 
CATCTGTGCT GCAGAGGAAC AGCCAGCAGA AGATGGGCAG GGTGAAACTA ACAAGAACAG 
GACAAAAGGA GGATGGCAAC AGAAGAGTAA AQGACCCAAG AAAACTGCTA AATCAAAAAA 
AAAGAAACCr TTAAAAAAAA AACCTACACC TGTGCTATTA CCACAGTCAA AGCAACAGAA 
ACAAAAGCAG GCAAATGGAG TCGTTGGGAA TGAAGCTGCA GTAAAGGAAG ATGAAGAAGA 
ACymCAGAT AAGGGCAGTG ATTCTGAAGA AGAAGAAACC AATAGAGATT GCCAAAGTGA 
GAAAGATCAT GGTAGTGACA GAGACTCTGA TAGAGAGCAA GATGAAAAAC AAAACAAAGA 
TGATGAAGCA GAGTCGCAAG AATTACAACA AAGCATACAG CX5AAAAGAGA GAGCTCTATT 
GGAAACCAAA TCAAAAATAA CACATCCTCT GTATAGCCTT TACTTTCXTTG AGGAAAAACA 
AGAA1GC?rGG TOGCTTTACA TTGCAGATAG GAAGGAGCAG ACATTAATAT CCATGCCATA 
TCATCTOTGr ACGCTGAAAG ATACAGAGGA GGTAGAGCTG AAGTTTCCTG CACCAQGCAA 
GCCTGGAAAT TATCACrTATA CTGTGrrrrCT GAGATCAGAC TCCTATATGG GrTTGGATCA 
GATTAAACCA TTCGAAGTTK GGAAOTCAT GAGGCTGAAG CCTGTGCCAG AAAATCACCC 
ACACnCGGAT ACAGCAATAG AQGGGGATGA AGACCAGGAG GACAGTGAGG GCTTTGAAGA 
TAGCTITCAG GGAGGAAGAG GGAQGGAGGA AGGAAGGTIGG TGGACTTAAG GCAGTTACTC 
TOGAATGGGA CCCACAGICT TTTGCACCAT ATTTrGGCAA TTTTTTTTGC CCGTrrmiG 
GAAGTGTITT CCNINAANCC CAGGAACCAT TACAGAACCG 
(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1337 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 
60 CirCCGCTTC TCCGGGCAGC TGCCACTGCT GTAGCTTCTG CCACCTGCCA CGACCGGGCC 60 
ICTCCXnGGC GTTTOGTCAC CTCTGCTTCA TTCTCCflCCG OGCCTATGGT CCCTCTTOGA
GCCAGCGTGG CGGGCCTCGC GGCTCCCGGG TGGTGAGAGA GCGGTCCGC5G AACGATGAAG 
GCCTCGCAGT GCTGCTCCTG TCTCAGCCAC CTCTTGGCTT CCGTCCTCCT CCTGCTCTTG 
CTOCCTGAAC TAAGCGGGYC CCTC3GMAGTC CTCCTGCAGG CAGCCGAGGC CGCGCCAGGT 
CITCGGCCTC CTCACCCTAG ACCACGGACA TTACCGCCGC TGCCAC0GC3G CXTTACCCCT 
GCCCAGCAGC CGGGCCGTOG TCTOGCTGAA GCTC3CGGGGC CGCGGC3GCTC CGAQGGAGGC 
AATOGCAGCA ACCCTGTGGC CQGGCTTGAG ACGGACGATC ACGGAGGGAA GGCCGOGGAA 
GGCTCGGTOG GTrGGOGGCCT TGCTGTGAGC CCCAACCCTG GCGACAAGCC CATGACCCAG 
CX3GGCCCTCA CCXTTCnTCAT GGTOC?rGftGC GGCGCGGTCC TQC?rGTACTT CGTOGTCAGG 
ACGGICAGGA TGAGAAGAAG AAACCGAAAG ACTAGGAGAT ATOGAGTriT GGACACTAAC 
ATAGAAAATA TOGAATTGAC ACCTTTAGAA CAGGATGATG AGGATGATGA CAACACGTTG 
TTTCATCCCA ATCATCCTCG AAGATAAGAA TGTGCCTTTT GATGAAAGAA CrTTATCITr 
CTACAATCAA GAGrTOGAATT TCTATGTTTA AGGAATAAGA AGCCACTATA TCAATGTTGG 
GGGGCTATTT AAGITACATA TATITTAACA ACCTTTAATT TGCTGITGCA ATAAATACCG 
TATCCTTTTA TTATATCTTT ATATGTATAG AAGrTACTCTR TTAATGGGCT CAGAGATGTT 
GGGGATAAAG TATACTOTAA TAATTTATCT GTTTGAAAAT TACTATAAAA CGGTGTTTTC 
TCATCGCTTT TTCnTTCCTG CTTACCATAT GATTGTAAAT TGTTTTATGT ATTAATCAGT 
TAATOOTAAT TATTTTTCCT GATCrrCATAT GTTAAAGAGC TATAAATTCC AACAACCAAC 
TGC7ICTCTAA AAATAATTTA AAAfrTCCTT TACTGAAAGG TATTTCCCAT TmCTGGGG 
AAAAGAAGCC AAATTTATTA C lTlUlXJ n G GGGTmTAA AATATTAAGA AATCTCTAAG 
TTATTOITrc CAAAACAATA AATATGATTT TAAATTCTCT TAAAAAAAAA AAAAAAAACC 
CCGGGGQGGG GCCXX3GN 



120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1337 






(2) INFORMATION FOR SEQ ID NO: 186:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 941 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:
GGCACGAGCC TOGACGCAGC AGCCACCGCC GCGTCCCTCT CTCCACGAGG CrGCCGGCTT 60 
(2) INFORMATION FOR SEQ ID NO: 187:






AGGACCCCCA GCTCCGACAT GTCGCCCTCT GGTCGCCTGT GTCTTCTCAC CATCGTTGGC 120 

CTGATTCTCC CCACCAGAGG ACAGACCTTTG AAAGATACCA CCTTCCAGrTTC TTCAGCAGAC 180 

TCAACTATCA IGGACATTCA GGTCCCGACA CGAGCCXXIAG ATGCAGTCTA CACAGAACTC 240 

CAGCCCACCT CTCCAACCCC AACCTGGCCT GCTGATGAAA CACCACAACC CCAGACCCAG 300 
ACCCAGCAAC TGGAAGGAAC GGATGGGCCT CTAGTGACAG AOXXAGAGAC ACACAAGAGC 360

ACCAAAGCAG CTCATCCCAC TGATGACACC ACGACGCTCT CTGAGAGACC ATCCOCAAGC 420 

ACAGACGTCC AGACAGACCC CCAGACCCTC AAGCCATCTG GTrTTCATGA GGATGACCCC 480 

15 TTCTTCTATG ATGAACACAC CCTCCGGAAA OGGGGGCTGT TOGTCGCAGC TGTGCTGTTC 540 

ATCACAGGCA TCATCATCCT CACCAGTGGC AAGTGCAGGC AGCTGTCCOG GTTATGCCXX; 600 

AATCATTGCA GGTGAGTCCA TCAGAAACAG GAGCTGACAA CXYGCTGGGC ACCCGAAGAC 660 


CAAGCCCCCT GCCAGCTCAC CGrTGCCCAGC CTCCTOCATC CCCTCGAAGA GCCTGGCCAG 720

AGAGGGAAGA CACAGATGAT GAAGCTGGAG CCAGGGCTGC CGGTICCXSAGT CTCCTACCTC 780 

25 CCCCAACCCT GCCCGCCCCT GAAGGCTACC TGGCGCCTTG GGGGCTGTCC CTCAAGTTAT 840 

CTCCrCTOTT AAGACAAAAA GTAAAGCACT GrrGGTCTTTG CAAAAAAAAA AAAAAAAAAA 900 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTCG A 941 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 654 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

QAATTCGGCA CGAGGCAGCT TGTGCTTTAA AGGAGGTOTT CAAAGCATGT CTGAGCAGAG 60 

ACTTTTGGGC TCTGTTTTAA TTAATACTTT AAAATAATTC ATATTTAAAA TATCARATGT 120 

TTCCATAAAG AGGAGGATGT TTAAATGCCT CCAGACTACA TTCCTTTTTA TTSCTTGATT 180 

50 TTACCTQGGA GnCCAAAGTT CAATTCCCAT AAAGCAAGCG TTTTATTTGrT CAdTTCAAT 240 

ATACATCCGA TTGCCATGCT TAAGATGCAA TATGGGCTGC GGAAATAQGT TAACCCACAG 300 

GCTCCCAGGG CCCAGTGTAG AAQGTGAGAG ATTCGTGTAA AATGATTCAA ATAAAAGGAA 360 

GACCCTGGCC GGGTGCCGTA RCTCACGCCT GTAATCCCAG CACTTTGGGA GGCCGAAGCG 420 

ACTIGGATGAC GAGOTTAGGA GTITOGAGACC AGCCTGGCCA ACATCGTGAA ACCCCGTCTC 480 

60 TACTAAAAAT ACAAAAATTA GCCQGGCATG GTGGCAGGCA CCTC5TAATCC TAGCTAGTTG 540 
GGAGGCTGAG GCAGGAGAAT CGrriTGAATC TGGGAOrTGG AGGTTGTCAG TGAGCTGAGA 600






TCG0C5CCACA GCACTCCAGC CTGGGTGACA GGGTGAGACT CTGTCTCAAA NAGA 654 



(2) INFORMATION FOR SEQ ID NO: 188:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1848 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

GAAACTGGAC CGGAGAACCG GAGCGAAGCG AAGOGGAAGC CCGGAATGAG GCCGGACTGG 60 

AAAGCCGGAG CGGGGCCAGG CGQGCCTCCC CAAAAGCCTG CCCCTTCATC CCAGCGGAAA 120 

CCGCCGGCCC GGCCGAGCGC GGCGGCOGCT GCGATTGCAG TCGCGGCGGC GGAGGAAGAG 180 



25 AGACGGCTCC GGCAGCGGAA CCGCCTGAGG CTGGAGGAGG ACAAACCGGC CGTGGAGCGG 240 



300 
360 



600 
660 



TGCTTGGAGG AGCTGGTCTT CGGCGACGTC GAGAACGACG AGGACGCGTT GCTGCGGCGT 
CTGCGAGGCC CGAGGGTTCA AGAACATGAA GACTCGGGTG ACTCAGAAGT GGAGAATGAA 
GCAAAAGCTA ATTTTCCACC TCAAAAGAAG CCA CJriTGG G TGGATGAAGA AGATGAAGAT 420 
GAGGAAATGG TTCACATCAT GAACAATCGG TTTCGGAAGG ATATGATGAA AAATGCTAGT 480 
35 GAAAOTAAAC TTTCGAAAGA CAACCTTAAA AAGAGACTTA AAGAAGAATT CCAACATGCC 540 
ATCGGAGGAG TACCTGCCTG GGCAGAGACT ACTAAGCGGA AAACATCTTC AGATGATGAA 
AGTGAAGAGG ATGAAGATCA TTTGTTGCAA AGGACTGGGA ATTTCATATC CACATCAACT 
TCTCTTCCAA GAGGCATCTT GAAGATGAAG AACTQCCAGC ATGCGAATGC TGAACGTCCT 720 
ACPCnTCCTC GGATCTCCAT CTGTGCAGTT CCATCCCGGT GCACAGATTG TGATOGTTGC 780 
45 TGGGATTAGA TAATCCTGTA TCACTATTTC AGGTTGATGG GAAAACAAAT CCTAAAATTC 840 
AGAGCATCTA TTTOGAAAGG TTTCCAATCT TTAAGGCTTG TTTTAGTGCT AATGGGGAAG 
AACTTTTAGC CACGAGTACC CACAGCAAQG TTCTTTATGT CTATGACATG CTGGCTQGAA 
AGTTAATTCC TOTGCATCAA GrTGAGAGGTT TGAAAGAGAA GATAGTGAGG AGCTTTGAAG 
TCrcCCCAGA TQGGTCCTTC TTGCTCATAA ATGGCATTGC TGGATATTTG CATTTGCTAG 
55 CAATCAAGAC CAAAGAACTG ATTGGAAGCA TGAAAATTAA TGGAAGGGTT GCAGCATCCA 1140 
CATrCTCTTC AGATAGTAAG AAAGTATACG CCTCTTOGGG GGATGGAGAA GTTTATGmT 1200 
GGGATOIGAA CTCAAGGAAG TGCCTTAACA GATTrGTTGA TGAAGGCAGT TTATATGGAT 1260 




900 
960 
1020 
1080 
TAftGCATTGC CACATCTAGG AATQGACAGT imnTCCTTG TGGTTCTAAT TGrTQGAGTGG 1320

TAAATATATA CAATCAAGAT TCTTGTCTCC AAGAAACAAA CCCAAAGCCA ATAAAA13CTA 1380 

TAATCAACTT GCTTTACAOGT GTrACTTCTC TGACCTTCAA TCCTACTACA GAAATCTTGG 1440 

CAATTGCrrC AGAAAAAATG AAAGAAGCAG TCAGATTGGT TCATCTTCCT TCCTGTACAG 1500 
TATTITCAAA CTTCCCAOTC ATTAAAAATA AGAATATTTC TCATGTTCAT ACCATGGATT 1560

TITCTCCGAG AAffTOGATAC TTIX3CCTTGG QGAATGAAAA GGGCAAGGCX: CTGATGTATA 1620 

GGTTOCACCA TTACTCAGAC TTCTAAAGAG ACTATTTGAA GTCCAGTrGA GTCACAAGAG 1680 

15 AAGCCTGrrCT TCATATATCA TCTCAGAAAC TTTCCTGAAT ATCTTGATAAT ATATGGAAAA 1740 

TCATTTATAG ATCCAGCTCT GCTTAAGAGC CAGTAATGTC TTAATAAACA TGrTGGCAGCT 1800 

nrcnTCAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACTCGA 1848 






(2) INFORMATION FOR SEQ ID NO: 189:



240 
300 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1146 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

AAAAAAAACC CAGGGGAACN TTOGGGGCCG CTrTONNTTC CCCCTCCAGG CCATTGGGGA 60 


ATTCTTCAAG TTAATCCTGC TTTOCTCTTG GCCAACAGGG CTTCTAGGGG GGAGAGACCC 120 
AGGATCATCA AGGGCTTTCGA GTGCAAGCCT CACTCCCAGC CCTGGCAGGC AGCCCTGTTC 180 
40 GAGAAGACGC GGCTACTCTG TGGGGCGACG CTCATCGCCC CCAGATGGCT CCTGACAGCA 
GCCCACTGCC TCAAGCCCCG CTACATAGTT CACCTGGGGC AGCACAACCr CCAGAAGGAG 
GAGGGCTOTG AGCAGACCCG GACAGCCACT GAGTCCTTCC CCCACCCCGG CTTCAACAAC 360 
AGCCTCCCCA ACAAAGACCA CCGCAATGAC ATCATGCTGG TGAAGATGGC ATCGCCAGTC 420 
TCCATCACCT GGGCPC?rcCG' ACCCCTCACC CTCTCCTCAC GCTGTGTCAC TGCTGGCACC 480 
50 AGCrTCYCTCA TTrCCGGCTG GGGCACaiACG TCCAGCCCCC AGTTACGCCT GCCTCACACC 540 

tigsgatocg ccaacatcac catcattgag caccagaagt gtgagaacgc ctaccccggc 

AACATCACAG ACACCATGGT GTGTGCCAGC GTGCAGGAAG GGGGCAAGGA CTCCTQCCAG 
GGTGACrcCG GGOGCCCTCT GGTCTGTAAC CAGTCTCTTC AAGGCATTAT CTCCTQGGGC 720 
CAGGATCCCT GTCCGATCAC CCGAAAGCCT GGTGTCTACA CGAAAGTCTG CAAATATGTG 780 
60 GACTOGATCC AGGAGACGAT GAAGAACAAT TAGACTGGAC CCACCCACCA CAGCCCATCA 



600 
660 



840 



CCCTCCATTT CCACTTGGrTG TTTC3GTTCCT GTTCACTCTG TTAATAAGAA ACCCTAAGCC 900 
AAGACCCTCT ACGAACAITC TTTOGGCCTC CTGGACTACA GGAGATGCTG TCACTTAATA 960 
ATCAACCTGG GCTTCGAAAT CAGTGAGACC TGGATTCAAA TTCTGCCrTG AAATATTGTG 1020 
ACTCTGGGAA TCACAACACC TOGTITGTTC TCTGTTGrrAT CCCCAGCXXX: AAAGACAGCT 1080 
10 CCTGGCCATA TATCAAGGTT TCAATAAATA TTTGCTAAAT GAAAAARAAA AAAAAAAAAA 1140 

ACTCGA 1146



(2) INFORMATION FOR SEQ ID NO: 190: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 906 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

ACTCCCTCAC CCAGGTCCCA GCCCTGGGAA CCACCTACCG TGAGCCCITT TGCAGATATA 60 

GACTCATTTC ATCCTCAGAT GGTCCTTCAA GGTAGGTACT TTAGTCCCAT TTTAGAGATG 120 

AGACGATTGA GGCCAGAGGG GrTGNNGTEAAC TTGCCTGOGG GCTCACGAGC ACAAAftGGAG 180 

CCGAGC3CAGG ATCTCACCCT TGTTCTCTGG CCTCACTGCC CTCACTTTGC CATGACCCGA 240 

35 AOTTATCTCC CTACAAAGCA ATGCATGGTC CAM3GVTCTT TTTATICTAT TTTTATTTTT 300 

AAGGGTCCTG TTCAAAACTG GTGrTGAGCTC TGAGGAGTCC TGAACCCTOG GTC3CAGCATC 360 

CTAGCATCCT GGGAGTCCTT TTCTOCCCAC ACTGAGCTGG CXTCCTCGAG GGGTCQGGCT 420 


GCTCICCCIG GAAGCCTGGC AGCAGCACTG TATOGGGTTG GCTGAAGCTG ARCGCCGTGG 480 

GCTQCAGGGC TCCMGGAATC CCCCTTTTGCX: TGAAGGGGTT CXTTCTAGCC MDGGATGTTT 540 

45 ATCAGGTCTC TCTGATGCCC CAGGCGCAGG ACATGTOTOC GGGTGGAGAA AAGCAGC3CCC 600 

TTTCAGTOCC AGCTCCACTC AATTTCTATG TGGACCAAGA ACGATAAACT TAAAAAATTT 660 

r irr rccr A A gctatcttca gaatatqgtg tatttttatg tggaaaagaa aagttatgaa 720 

GGCAOCICTT ACriTAftGAG AAAATTCATT AAAAGTCCTC GAGGTATGAA GATGACGGCG 780 
TGCITCTCAA TCATTITOGC ATAACTTGAT TGTGGCTGTA ATTrmTIT 'iTrmTlCT 840 
CAAGCATCTC AGftCAATAAA GTCTTTGTAA AAAGRGAAAA AAAAAAAAAA AAAAAAAAAA 900
ACTCGA 906



WO 98/54963

PCT/US98/11422



(2) INFORMATION FOR SEQ ID NO: 191:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1941 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

CnCAGCTGA AGCCCAGGGA CCCCTTTTCC ACCCTQGGCC CCAATGCCGT CCTTTCCCCG 
CAGAGACK3G TCITCGAAAC CCTCAGCAAA CTCAGCATCC AGGACAACAA TC?rGGACCTG 
ATTCTCGCCA CACCCCCCTT CAGCCGCCTG GAGAAGTTGT ATAGCACTAT GGTGCGCTTC 
CTCACTGACC GAAAGAACCC GGTGTOCCGG AGATGGCTGT GGTACTC3CTG GCCAACCTGG 
CTCAQGGGGA CAGCCTOC5CA GCTCC5TGCCA TTGCAGTGCA GAAGGGCAGT ATCQGCAACC 
TCCraSGCTT CCTAGAGGAC AGCCTTQCCG CCACACAGTT CCAGCAGAGC CAGGCCAGCC 
TCCTCCACAT GCAGAACCCA CCCTTTGAGC CAAYTAGTrGT GGACATGATG CGGCGGGCTG 
CCCGCGCGCT GCTTCCCTTC GCCAAGGTGG ACGAGAACCA CTCAGAGTTT ACTCTCTACG 
AATCACGGCT CJITOGACATC TCGC3TATCAC OGTTGATGAA CTCAKTGGTT TCACAAGTCA 
TITGTCATOr ACTGrTTTTG NATTGCXXZAG TCATGACAGC CC3TGGGACAC CTCCCCCCCC 

cgrenCTT TC TGCGrcrrcrc gagaacttag aaactgactg TTGCCcrrrA tttatgcaaa 

ACCACCTCAG AATCCAGTTT ACCCTGTGCT GTCCAGCTTC TCCCTTGGGA AAAAGTCTCT 

ccrorrrcTC tctcctcctt ccacctcccc tccctccatc acctcacgcc TTTCTGrrcc 

TTCTCCrCAC CTTACTCCCC TCAGGACCCT ACCCCACCCT CTTTGAAAAG ACAAAGCTCT 
40 GCCTACATAG AAGACTTTTT TTATTTTAAC CAAACTTTACT GTTGrTTTACA GTGAGTTTGG 
GGAAAAAAAA TAAAATAAAA ATGGCTTTCC CAGTCCTTGC ATCAACGGGA TGCCACATTT 
CATAACKTIT TTTAATOGTA AAAAAAAAAA AAAAAAATAC AAAAAAAAAT TCTGAAGGAC 
AAAAAAGCHG ACTGCTGAAC TGTGTGTGGT TTATPGTPGT ACATTCACAA TCTTGCAGGA 
CSCCAAGAAGT TOXMTTCT GAACAGACCC TGITCACTGG AGAGGCCTGT GCAGTAGAGT 
50 GTAGACCCTT TCATOTACTG TACTCTACAC CIGftTACTGT AAACATACTG TAATAATAAT 
CTCTCACATG GAAACAGAAA AOGCTQGGTC AGCAGCAAGC TGTAGTTriT AAAAATGTTT 
TTACTTAAAC GTTGAGGAGA AAAAAAAAAA AGGCTTTTCC CCCAAAGTAT CATGTGTGAA 
CCTACAACAC CCTCACCTCT TTCTCTCCTC CTTGATrGTA TGAATAACCC TGAGATCACC 
TCTTAGAACr GGrTTTAACC TTTAGCTQCA GCOICTACGT CNAWCGOTGT GTATATATAT 

60 GACGrnCTAC attgcacata cccttggatc cccacagitk qgtcctcctc ccagctaccc 






60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
CXTTATACTA TGACGAGTTA ACAAGTTGGT GACCTGCACA AAGCXSAGACA CAGCTATTTA

ATCTCTTGCC CAGATATCGC CXTTCTTOGT GCGATGCTGT ACAGGTCTCT C3TAAAAAGTC 


CTTGCTOICT CAGCAGCCAA TCAACTTATA GTTTATTTTT TrCrGGGTIT TTGrTTTTGTT 

TKTrnTCTr TCTAATCGAG GTGTGAAAAA GTrCTAGGTT CAGTTGAAGT TCTGATGAAG 

10 AAACACAATT GftGATTTTTT CAGTGATAAA ATCTGCATAT TrcTATrTCA ACAATGTAGC 

TAAAACITCA 1CTAAATTCC TOrmTTTT CCTTTTTTGG CTTAATGAAT ATCATTTATT 

CAGTATCAAA TdTTATACT ATATGTTCCA CGTGTTAAGA ATAAATGTAC ATTAAATCTT 

GGTAAGACTT TAAAAAAAAA A 






1560 
1620 
1680 
1740 
1800 
1860 
1920 
1941 






(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2118 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 
AAATAATAAT AANAATAAAT AAAAATWAAG TGCTTAKTGT AACTCAGCGG ACAQGGCTCC 60
CAGCTGCTCT GGCACGTGGG ACACCYTCCA CCCTGCACAC AACAGGCATG CAAAGAQGAC 120
TGGATATCGT GGGGTAGAGT GCTTCTGGrTG TGTTCACTTT AAGAAAACAT CTGCCAAGAG 180
AGAAGAGIGC CCAQGAAAGA CCAGGAAAAT ACAAGTACAT QGCTGCTTCA TACCATATAC 240
CCCAATTCTT TAAAGCAGCA AAAGGCACTT TTTTTTTCAG GCCAGAGTGA ATCTAAAACA 300
AACCTQQCTT TGCTTACAGG GAAGCTGTCC CAGAAGGACT GAGTGATGCC TCTTGTTCCC 360
TAAGCTCTGG AGAGTrCTTTG CAAGTTTCCA ACGACATTTC CAACCAGGTG GGAGAGACCA 420
GCACTTGACG AGACAAlOTCA GACCCAAAAA ACGACGCCAA GGTAGTGAGT GGGTGCCTAT 480
TTCGGAffEAG GATCATTTCA GGAAAACAGG AAGAAAAACC QGTCAGAAAG TCGCACTTTG 540
GAACTTOGAAA GCICTTTGCA AATAGCAACT CTGGCTAAAG CGAAAATGriT AATCAAGTAG 600
AAA(?rAAAAT TCAGGAaXTTT AGAAGCTCAT CCTTCTGATG AGAACTATTT TmTTCCGT 660
GAAGGAACTA TTATTACTTT AAAAGTGAGG GTAATTTACA TATGGGGICT ATATATTCTA 720
AAAATACJTAA TAAAAGTACC TTTTATAAGC AATGTTGTGT GGCITCTAGA AGAAAGCAGG 780
GAGGAAAAAA AGGCAGGCAA AACTAGTCTA GGTCTAGGCC CTAAAAATGA GCTTCCTTCC 840
CACTTCACTC GAAACGCCCA TGTGATTTCT AGGCPGAAAA TAGGTAGGAT TTAACGAGrTA 900
ACCTASTTCC CTTCrGTCTC TCATTTCTGA TCAGCTGATG GAGCTGCTAG TAAGAGGC3GC 960
CGATCA'TCCT CCCAGACGAG TCCriTCGCC TCTTGCTCTC CATCCCAAO: CTGACTCCTT 1020
CAGCAGCAGC CCCCTCCTTC TOTCTCCATC TGATGCAGGC AAGCAGGAGC AGTAAGAGGG 1080
CATCCCATOT TCCAGTTCAC CITCTATQGG GTrGACTARGA GGTrCCCC5C?r AACTAGGC3CA 1140
GCCCAHGCCC AGCAGCTTGC AAAAGCAGCT GCAAGCTTCA GAAACCCACT TCCTCCAACA 1200
CCAGGGAGCTT GGCAGAGAGC CCATCCAAAA GCCCACTGGG AGAGGCATAA GATTCTGrEGC 1260
CAGGCCCCCA GOTCCCCTCT CHCTCAGGTA GGCTCTGCTA CTCGCCTCTG AAGTAAflGGC 1320
AAANACAAAC GC3GCAGGGCA C3GGTCGCAGG AATAAAAAAC TCTQGACAGA AACCCITTTA 1380
ATAAAGGAAA TTCCACCXXTT CCCAATCCTT CCATGGAAGG GTGAGACCIT AATGTCATGT 1440
AAGAGGAAGG TCTTCTCTGG CTITCAGGCSA AACAGCTGCA QCTGAAACrr AQGGGCXX3VT 1500
TCCAGGGCAC mTCACCAC AGCCAGTOCA GCCGCTCCAA GrTGCCACTGT CAGCCCCATC 1560
ACTGCXIAATT TCACAAAGCG CTK3OTCCTT QGCTTCSGTCA GGACATCETr TGTTCGATCT 1620
TCAGC5CCGCA GAAGTCCCCG AANACCGCTG CXXX:AGCACC ATATCAGC3CC TCTQCTOCSGC 1680
TCATGCCAGC TCAAACnCIT TGAAAGTAGA GGCTGCCGTC CTCTCAGCTT GCTGTTCGGC 1740
AGCGGCCTCC CGAGCAAOTT CC5GATCGGGG AAACTGAACA AAAAGGTCTC CTSTCTGCTG 1800
ATCACTCTCT CATAGGGCAA C3TCCTGAQGG ATCTGGGACA ACAGGTGGTG GACCX5AGGCC 1860
ATOTCACACyr CACACTCCAG GACTTCCTGC TCGCGATACA ACACAATCAC GGCTGCAAAG 1920
TAAA3X:GGCA TCAGIGGGTG GCAGGCCAGG AAGAAGTCAT ATAACCGCAC GACGrroCCTG 1980
AAGTCAGACA C3GACATCCCC AAACCAGGTG ATGAGCCAGC TGAGGGCAAA GATGGTCCCT 2040
ACCTCAGCAC TCTGCATCAA GTCATC3GAGC TCTGGATTCA CXrTGGTCAAT GATGQGCATC 2100
AGATAGTTTA ATATATQC 2118




(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1538 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193: 
CCGCSOTTCGG CTCTOICTCA GCAGCCGGGC GC30C3CTCGGG CGGGACATC3G CAGCCTGTAC 60
ACXXCGQCGG CCroCXXXTTG GGCAGCCGC^ 120
GGCCAAGGCC CXTCTC?roCG CGGCCOIAC3C TCSGAGCCTTC TCGCCAGCC?r CGACCACGAC 180
GACX3CGGAGG CACCTCICGT CCCGAAACCG ACCAGAGGGC AAAGTGTTGG AGACAGrTTGG 240
TOTTrTTCAG GTGCCAAAAC AGAATGGAAA ATATGAGACC GGGCAGCTTT TCCITCATAG 300
CATTTITCGC TACCX3AGGTC TCGTCCTGTT TCCCTCGCAG GCCAGACICT RTGACCGGGA 360
TCTGGCrrcT GCAGCrCCAG AAAAAGCAGA GAACCCTGCT GGCCATQGCT CCAAGGAGGT 420
GAAAGGCAAA ACICACACTT ACTATCAGGT GCTGATTGAT GCTCGrTGACT GCCCACATAT 480
ATCTCAGAGA TCTCAGACAG AAGCTGTGAC CTTCTTGGCT AACCATGATG ACAGTCGGGC 540
CCrCTATCCC ATCCCAGGCT TGGACTATGT CAGCCATGAA GACATCCTCC CCTACACCTC 600
CACTCATCAG CTTCCXIATCC AACATGAACT CTTTGAAAGA TTTCTTCrGT ATGACCAGAC 660
AAAAGCACCT CCTTITOIGG CICGGGftGAC GCTAAGQGCC TGGCAAGAGA AGAATCACCC 720
CTOQCTGGAG CTCTCCGATG TTCATCC3GGA AACAACTGAG AACATAOGTG TCACTGTCAT 780
CCCCTTCTAC ATOGGCATGA GGGAAGCXXrA GAATTCCCAC GrrOTACTCGr GGCGCTACTG 840
TATCCC?nTC GAGAACCrrc ACAGTCATOT GGTACAGCTC CGGGAGCXX3C ACTGGAGGAT 900
ATrcAcrrcrc tctcgcacct tggagacagt gcgaggccga GGGcrrAGrrcG gcagggaacc 960
AOKJrrATCC AAGGAGCAGC CIGCCTTCCA CTATAGCAGC CACGTCTCGC TGCAGGCTTC 1020
CACTGGGCAC ATOK3GGGCA COrTCCGCTT TGAAAGACCT GATGGCTCCC ACTTTGATGT 1080
TCGGAarccT cccrrcrccc tggaaagcaa taaagatgag aagacaccac cctcaggcct 1140
TCACTGCrrAG GCCAGCKSAG GCCCCAAGTC CXXy^GGCTTG GTCACOGGGA AGAACAACTC 1200
TCATCCCACA ATTCCTOCAG AACTCITCTC TCCCCATCKT GGGCCACAGT GGGrrCTCITA 1260
ATTTGATICT GGGGnCTTT TTOTOGGGAG GGGTGGTATA ACTITrCTTC AGAAGACCCA 1320
TOFGGGACAC CTCCAAGGCT GGCCTCCTCA TAAGCCCTQC CTACACCATG TTCCAGTAAA 1380
ccrarccACC aaggaacict cttcagctoc cacaggcctg gaggagtttc crGGCcrorc 1440
AaJTCAGGrr tcatcactaa accagtgcas GYTTGGCCAA AAAAAAAAAA AAAAAAAAAA 1500
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACTOGA 1538




(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1098 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

AGACCCTCTC TCAAATAATA ATAATAATAA TAATCrTATT TTGGAGAATA AAGAGACCTS 60
TGGATTTGAG CTTGCCATTTC GGTAGAAAGA AAAGACGTTT ACACCGAGAA ATAGTCTGTC 120
TO5CCCTCAA GGAGCAGAGG GATGCATCGC TGGAGCTTGAC CTACAGnTGA AGAAGACTCA 180
TTATGACAGA CCTTCTCCTT CTrCCTTGTG GAAAGTICTTT CCTCTGCTGC TACTGCTCAT 240
GAGACrCTTC CCCCTOXTG TCCCAGGGAA CCAAAGGGCT TTKCTACCAC ACCXnTTCTT 300
NGCCCCCCGC CrcCCATCTC TCCTGTOCCT TTffTACTCAG CAATTCTTNS TTTGCTCCCA 360
TTATCnCCA GCCGGATACA GAGTCAATAG TTAACCACAC TTAGGTCAAA TAGGATCTAA 420
ATnTTCTTC CIGCTCCNGT OTAAAGAGGC CAGTGTTTGT GrTGTTGCAAG CAGCCTTGGA 480
ATACJTAACTC TTCTCATrTG TTTGGGATCT GGCCAl«:AAG TICCAGAATG ATACACGGAT 540
CAOTGCAGAA GTICATCAGG CTCTCC3GACC TTAGCSGCrGT TGGAGAAGGC TTCAGCAGCA 600
GAACTCATCG TKAWKGYTCG TGTTCTCCAT CrTCAACTTT CTTTGCTTCG ATCATACACA 660
AGAATACATT TGGAAGGGCA AAAAATGAAC ACTGTTGTTC ATTGCAGCCG TGrTTTTCTrGA 720
CACAGATCCA CACTCTOCTG TGAAGACCTT CTCTCAAGTG GSATYTGGGA GTCCATQCCA 780
GATCflJTOCTG CTTCAIGAGA GACTGACAGC TATCAGGGGT TGTGGCACTT AGrTGAGGACT 840
CTCCTCCCCC AGTGTGTGCT GATGACACAT ACACACCTGA CAATAGCTTG AGTCTTCTCT 900
CTTCCrrTTA CTCTOTAGCC AACATACACA TGATTTAAAA CCCTTTCTAA ATATCTATCA 960
TOCTTCATCC TTOTCCAAAT QCAGAGTCAG AGCTATTTGT ACTrCATTAT TATTTCCAAG 1020
GCGAATACrrT GGCrrrCTTT TTGCAAAAAT AATTAAAGTr TrTGTATOrr GCAAAAAAAA 1080
AAAAAAAAAA CTACGTAG 1098

(2) INFORMATION FOR SEQ ID NO: 195:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1001 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

GAATTCQGCA CGAGATAGCT TGCATCTCAT COCAGTAAAA CCACTTATTT ATAACATATC 60
AACcyrArrcA caagottcaa gagcaagatt gttctgaggt gagatgcaaa tttcaaaggg 120
CriGAGCACTA ATTOTTCCAG TCATrGTTTA TTTATTGGCT AGGACATAAT TACTCTCTTT 180
GAGGTTACAC ATCTQCCrcC AGGTTCCTGT GTGCTTGTGC CCTTOGGATC AGGCCAGGGC 240
AGACTOIGAT CACTCAGATT CAAACTCCCA GARTAATCAG CAAGAGCTTT CTAGAGACCA 300
AGGCCAGGCC TGATCCCTGA GQGATGCATG AGAAGGCTTG GAATCTCATT CTGCTATGGT 360
GGCTCTCTCT TGATCTTCTT GGAGTAGCAA AAACAGCAAT GriGGGCCXIAA TQGTGrTGGCC 420
TAAATGATCA CAAAGGTAAA TGAGTAAAGG GCrCAGCAGA TGAGTAAGGA GCCTTGTCCT 480
GAGAAATTAG CftCTGGGCTC TGCATTCAGA AACATC3TCAT AAGCATTGCC CATTGCACAT 540
aXXXriTTATT GTGTAAGGAC ATGAAATTCC AGTTTTGCAT AGCTAGTGAT GAATACCTGA 600
AGGGAATTGC AGACATATTT TArTTTATTT TTAATTGACA GATGGAATTG TATATATTTA 660
TCATGTACAT AATCATGCTT TAAAATATGT ACATTATGGA ATGGCTAAAT CAAACTAACC 720
TAGGCATTAT CTCATATAAT TOTCATTTTT GTTGGCGAGAA GACTAAAAAT CTACCCTTTC 780
AGCATTTTTA AAGAATACAA TGrGTTTTAT TAACAACAGT CACCATTTGG TACACTAGAT 840
CTCTTGAACT TCTTCCTCTT ATCTAACTGA GATCTTGTAA CCTITGATAA CAGCTCCCAA 900
GCCCrTCCXC AACCACTGCT CCACCCGTrGG TAACCACCAT TCTATTCTCA ACTTCCTGGT 960
AATCACCATT CTAGACACAG GGAAGACTCT CTACCCTCTG A 1001



(2) INFORMATION FOR SEQ ID NO: 196: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1443 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:




ATAAACKSAA ATAGGTCATG CAAATATAAA ATATTATTTT TAAATTATTT GTCATAAGAA 60
AOGATCGTCG CCATATTTTC CTTTAATAAT GGAAAAAATG TGCTTAGCAT TCTKTGGAAG 120
CyrGGTCAOXZA GATACTAGAC ATTTTCTAGG ATrrATTTCT ACCTGCATAT GTGGAAATGT 180
CTACTAcrrr AGArrTAiwr aatggcagct aactcagagg catcaaaatg tgctaatggt 240
CTAATATGGC CmCTCTPG CTGTYCTGTT TTGTARGCCT TCAATCAAGC ARGQGCAGGG 300
CCGTACAGIG AACTTC3TCCT TTGSCAGACG CCAGCGTCTG CCCCPGACCC OGTCTCCACT 360
CTCTGICTCC TQGAGGAGGA GCCCCTTGAT GCYTACCCTG ATTCACCTTC TQCGTGOCTr 420
CTACTCAACr GGGAAGAGCC GTGCAATAAC GGATCTGAAA TCCrTGCTTA CACCATTGAT 480
CTAGGAGACA CTAGCATTAC CGTGGGCAAC ACCACCATGC ATGTTATGAA AGATCTCCTT 540
CCAGAAACCA CCTACCGGTG AGTGCAAGGG AGTAGAAATC TGCATCAGCA CATCAGCACT 600
TGGGGATCTA ACTTAAACCTC TCGGGGAAAA TGACCAAGTG GATGTCATCT CCCAU-^iXiTT 660



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TCTAAGAGCX: CAGATGTCCA GAGTATTGTC TCACCTTGAT CCCTCACOCC P-^-J^rCTG 
TGAAT^AAGCC ACACTGGTTC AGGGACTCAC 'rSGACGGTTT TCTTGrTCC-^ YTA^CTTGC?. 

ccc?rcrcTAC cccagagtgg actcaratcc TCAAGrcArc CTCTGAAC.\T TG?j=rjrc.--a^ 



AATTATAAAA GGGCrTTGGC AATATGTTAG CCCAAX-ATT 



(2) INFORMATION FOR SEQ ID NO: 197:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1282 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

GAAAAAAAAA AOTATGACCC AGTAGCTAGG CACCT^-GC CCCGCCAiCT TG^JIPlCATAA 
AATTAACICT CACAGTATCA TCTTAGAAGT aW£=Jii5CC CCTTTATCCT GC?.GTGCCCC 
TCTACCACCA CCTACTGACA AAGAACATGG TGCIL^^.ZCIGG CATGGG^GAA i^-^nTCAGTT 
TGCTATGGCT TGTATGTGTC CCCTCAAATT CAAGCGrrTGC a\ATGTG?£A GO-JTCAAGAG 
50 (?K3GGCnCTT TAAGAGATCA CTAGGCCATG AGGGATTCTC TTAGGAC^GG G^'GAAGGCC 






60 CACrcCTTTA TTCTOGTCTC AGTATTGTGT GCTTAATAAG GAAATGAGAA AC-G^^IGGATC 



720 
780 
840 



C-^-Ji;ACTGT- 900 



960 
1020 
1080 



GCCGACNTTA ACAGTQGCTT AAATGATGGT .-^JVACTTTTA AGATI^.-A AAS3?rrGGC=. 
TTCGAGATAC GTTGACTTTT ATTAAACMAC CrATA:nTGT TTAATGAIiTT cr.-.-JiAAAAT 
ATCTGGAGCT CAGGGGTTCA ACTGAGGGAA CACA^TCmGA GRATCA-T:^ Tli^^AATTA 
15 AATCCCAGCT AACCCGTTGA AATTATCAAA AACATCTTCC ACGTACC-^ AASC?.rCTCA 1140 
GAGGATAGTT CTGTTATGGA GAAGATGAAA TSGTTr.^A GTGTAGa---i: TAT^GAAAGG 
TCAGCTTAGA TTTGGATAGT AAAACCTCAA GACCCTAZTT AAAAAGTATT •nATGAATGC 

20 

AGCATAAATA ATTTAATTCA GTGTTAANAT GCCAAGGCTA GTATATTG.=:G CTGiATGTGA 1320 
AAAGAAACrC ACATTQGGAG AATGCCACCT TCTCCTIATA AGATAGCTTT G^A^ATACCA 1380 
25 TTITAGACAG ATOSAAATTG AATAGCTTTA GAAAA3GCAA A/rGTTTGArC TTS^GGAAAA 
AAA 



1200 
1260 



1440 
1443 



60 
120 
180 
240 
300 



CATAAXAAAA GAGGTrTCAG GGAGCATCCT C-CTAC-rTTGC CiTCTGTATC T3AGAACACA 360 
GCAAGAAAGC CCTAGTCAAC AAGTTGCCAC^: TCCr?3?:rCT TAGACTICCC A- CCTCCAGA 
ACraiGAGAA ATACATnCT GTTCCTTACA AATIACCCAG -XTCCTCTAT ---•.iTATAG 
CAGCACAAAA TCAAGATACC ATACCTGAAC ACCT3AACAT TCTTCACAAG GrTAGTAAATG 540 



420 
480 



600 
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15 






AGGGCATAGG ATCAACAAGT TACTGCTAGA CCTCrOr.-Jl IC-CCACTA.-.T GG.- ".-J^iSArr 



AGATOTCTAA AAACACCTTA AGTATTTGTC TAGAAACCTG GrrGCAC^TGTC C?-J^-JL-i=;^Jir 
CAAAATICMA AATAATTTCA AAGGGCCTAA AGCACT.-KIT AATC^^-.-.-.r? a^^rriGrrTTT 
10 TAATGGTACT ACCACTCTCA AATTTAAAAT GTCATCTT.-- GITCCTrrrC CTCC-rATTGG 
ATTTATTGCT AAAACCTGGT AAACACTTTA ATCCrTTTCA ATrCC?Cr.--:: C.-Xr?r-CTCTT 



40 



ffTCTAATTCT GGTTACAAAT AAGTAACTGC CAAACTAA7C TTTCTAAAAA GC-A3ACTGA 



20 TTICCACICT GTATACAATA CATCCATGAT CTGrTATCC-i; C:;.TCATr7:G TA7r:JGCTCA 
CTTTATACAC CACCCCCCAT GCCACATCAA ATTAAArTAT CCTGA-------r GC-ATTGCAA 

AAAAAAAAAA AAAAAAACTC GA 

25 



(2) INFORMATION FOR SEQ ID NO: 198: 



50 

ATCCCAGACA TCAAATTAAG CTCAAATTAA GCTCTCXjrrT AAATCTTTAA ACACCTAArT 
TATATTCTAA TTCATCCCAG CCACTGATGC ATGTACTrrA GCTACHTrrG CTAAATAAGC 
55 ATATTAAITT TCCACATCAG GCCATCAGAT CTTGAGAACC AACAC^A-C "ZAGAATTCCX; 

TCTCTACTAA TOITrCACCT GCATGCAGCC rrCATr?^'2T TIGTAGCAAA AJAT AAAGTG 



660 



CTAimCAT CATTNCrrGT CTCITCGGAA GCTAAC-XTCA TGCTA-A-.--. GC-CA-.AAAT 720 



780 
840 
900 
960 



GTCCAGAATT ACTCGCAGAC TAATAGTCAC dGACTTCTC CCCCTC-CArC CCGATTrGCr 1020 



1080 



TCTCGTCACT CCTITGCICA ACAATGTAAA AGCTCCCATT GrTCTCCCAAA TAA-ACCAGC 1140 



1200 
1260 
1282 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 951 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

ATTTCGGAAC GAGGACTCAA GTGGGAGCGG OGGCAGGCTIA G^^-C^S^-.^- QGGGGATCTA 60 

TCTGCTAACT AAAGAATGTT TCTGTTnGT TAATTATTGT GTGTGTSTGG — -i'-Ai-i'^rr 120 

TCCTTAAGAG AATCAAAAAC TGAAAAAAAT GAGAATACAG GAAATGGCTC TTGrrrTATTT 180 

45 TTTTGCTCIG TTTACAGCTT GTTAATGCTC TACK?Krm GTITTCA^iL-^ AGATTTGnTC 240 

ACTGCCCAGC TOGTTTTCTG TCCTGAGCCC TATQCCCAGC COrCTTArA AATO-TGCCT 300 

GriTAGATGrr TTGATITIGT TCTGTTTGCT ATTGTTATCT TAAAGGTG7A rAACTCTGAC 360 



420 
480 
540 
600 



60 



AtCATTATCT ACnTTCTGGA TTAAAAAAAT TTGrrGTGTGA AGTrGCTTIG -TAA^CTGCAT 660
CyreGAATTAA TGGGACAGTC TGCCCTTrGT GTTAGATGTT AGAGCAAAAG AAAGGGCTTA 720
TACTCTTAOr ATIX3GAGCAC TTTGAAGATA GATATTTTCA GAAAAGATGT AGGATTTAAA 780
AGrrAAATTT TAAATTTTAG AAAAAGATAT GATGGCAATT GGAAATACTTC ACAATGAAGT 840
TCTTCAICCA CyrAGGTCTTT AACAGTOTTA TTrTGCCACT GGTAATGTGT AAACTGTGAG 900
TCA1TTACAA TAAATGATTA TGAATTCAAA AAAAAAAAAA AAAAAACTCG A 951



(2) INFORMATION FOR SEQ ID NO: 199:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1740 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

TTATTATAAT AATGATCATG ATTCCAAGGA AAAAACCTAC AGCGAATGriT CCATTTCTAC 60
CCCGCAOGCA GACACTCTCC CTAACACTGA TAACCTGAGC CCCCAGCACT GGACGGAAGA 120
ATGCTCGCOT CTCCCyrcrrGT ACTC5GTTCAG GGTTCTGGCC CCAGCCITGT CAGGACCCCC 180
TCCTCTCCAG AGCCCCCACC CCTCCCGCAA CAAGCAGCTG ATC5CCCCAGT GATTCTCTAT 240
ACATmrCA CCTCGGCCAA TATGTCCAGG AAAACTC3CTT ACTTCTCTTT TCITGCCTGG 300
AGCCTTCATT CHTCACCCTT ACGTTGCAAT ATAOGAATTA ATGCTACAAA ATAAAAGTAA 360
AGCTTACCrc AAAACTC3CAT AGTTTC3GGGC AATQGTATCT ACATCTCCCA CTGTGGGAAA 420
ACCAGCAAAG CATCAAAACT CTCAATTCTC CTGITACCRA ATGCAGATCT GAATTATAAG 480
ATOTrTATGT TTCACCATTC TTTCAACAAT GGGATTTTGT TACGAATTAT CCCTTTAACT 540
GAAACCCTCA GTTTrACTCT TTACATTATT AGGAAAACAG GGATATCTTT TGAATCTAAA 600
AATITCATCT ACAGCATOTG ATTTTTGAAG TTTACATGTA AAGTCACAGT ATAGGTGAAA 660
TAACCTTTOT CATAmTCA GACGTATCCT GCAGCCATGT TTrTACGTGA GTGriTTTAGT 720
CAAACTTACAT GGTAGACAGT CTTTCACAAT AAAAGGAAAA GGATTTTTTT TCCTCCAAAT 780
CJEACATTTAT CAACCTAATG ATTGATTTTT TTAAAAAGAG ATTTCGCCCC AGTCTGGTrT 840
ATCAAAGrrrc ATTOCCCTAA ACTGTGCTGA TTGrmTAA TCAAGTTATA AATTTCCAAC 900
CTAGATCATG TATCTACCAA CTCTCCTGCA TTTTCCAAAA GGCATTGAGC TTAAATAOTA 960
CTCTTGCTTA GACTAGGTTA TCCACTTACA TGCTGCGCTA AAGCCATGCC TTTGAAACTC 1020
dTCTTTAAA ACATGATATG AmTTGTGG GCAGnTTCAG AAAAGAAAAC AAACAAACAA 1080
AAATCGACCC TTTAArraTT ACTTCCAACT CAACAGATCT CCCTQCCGTA CrGCCTTITC 1140
CAGGAACTTT ACnCAGGCSC TGTCCAGATT GCAGTTCTGC CCCGTGTATG TGGATCTAGT 1200
TCACAGACTC TITOGAAGCC AGCAGTCGTG CCCTCCGTAT ACTGrrCCACT CATTTTATCT 1260
AGATTTGGTA TCCTCAGCAG CCAC?rGTTAA CACCAOTGrTC ACGTAGTTAN CAGATTCATC 1320
TTrrATCTAT TTAAACTAAT CXMACTATG ATTTGGTTTT TCCCTGCACC ATTAATTCTG 1380
GCATCAGATC AGTTTITOrG TTGrTGAACnT CTACTGTGCTr TTGACCCAAG ACCACAACCA 1440
TGAGACCCro AACTTAAAGAT AAGC3TACACA TACATTATTr GAGrTAACTCT TTCCTTQGGG 1500
GCCAATCTOT cyTATOCTTTT AGAAGTTTAC AGAATGCTTT TATTTTTGrC TATAACAAAC 1560
ACTCTCTCAT TTATTTCTCT TGATAAACCA TTTQGACAGA GTCAGGACCST rrGCCCTGTT 1620
ATCTCCTAGT GCTAACAATA CACTCCAGTC ATGAGCCGGG CTTTACAAAT AAAGCACTTT 1680
TGATGACTCA MAAAAAAAAA AAAAAAAAMC YCGGGGGC3GG GCCGGTAACC CATTTNNCCX: 1740



(2) INFORMATION FOR SEQ ID NO: 200:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1707 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 
GCITATAGAA GGGAGAGGAG CGAACATGGC AGCGCGTTGG CGC?rTTTGGT GTGTCTCTGT 60
GACCATQCTTC GTCGCGCTGC TCATCGTITTG CGACGTTCCC TCAGCCTCTG CCCAAAGAAA 120
GAAGGAGATG (TrcTrATCTG AAAAGGTTAG TCAGCTGATG GAATCSGACTA ACAAAAGACC 180
TOTAATAAGA AIGAATOGAG ACAAGTTCCG TCGCCTTGTG AAAGCCCCAC CGAGAAATTA 240
CTCCGITATC GTCATCTTCA CIQCTCTCCA ACTGCATAGA CACHCTGTCG TTTQCAAGCA 300
AGCTCATCAA GAATTCCAGA TCCTC5GCAAA CTCCTGGCGA TACTCCAGTG CATTCACCAA 360
CAGGATATTT TTTCCCATCG TGCSATTTTGA TGAAGQCTCT GATGTATTTC AGATGCTAAA 420
CAIX3AATPCA GCTCCAACTT TCATCAACTT TCCTGCAAAA GGGAAACCCA AACGGGGTGA 480
TACATATGAG TTACAGGTGC GGGGTTTTTC AGCTGAGCAG ATTGCCCGOT GGATCC3CCGA 540
CAGAACTOAT CTCAATATTA GAGTOATTAG ACCCCCAAAT TATGCTGGTC CCCTTATGTT 600
GGGATTGCTT nGGCTOTTA TTGGTGGACT TGTGTATCTT OGAAGAGTAA TATGGAATTT 660
CrcTTTAATA AAACTGGATC GGCnTTGCA GCrTTGTGTT TrGrrGCTTGC TATGACATCT 720
GCTTCAAATOT GGAACCATAT AAGAGGACCA CCATATGCCC ATAAGAATCC CCACACGQGA 780
CATCIGAATT ATATCCATCG AAGCAGTCAA GCCCAGnTTG TAGCTGAAAC ACACATTCTT 840
ciTCixyrrrA ATOcnxsGAcnr taccttagga ATCGrGcrrr tatgtgaagc tgctacctct 900
GACATCGATA ITOGAAAGCG AAAGATAATG TGTGTC3GCTG GTAITOGACT TGTTGTACTA 960
TTCTTCACTIT GGATCOICTC TATTTTTAGA TCTAAATATC ATGCXTTACCC ATACAGCTTT 1020
CTCATCACTT AAAAAGCTTCC CAGAGATATA TAGACACTCG AGTACTGGAA ATTGAAAAAC 1080
GAAAATCGIG TXHOTnGAA AAGAAGAATG CAACTTOTAT ATTTTGTATT ACCrCTTrTT 1140
TTCAAOIGAT TTAAATAOTT AATCATTTAA CXMAGAAGA TGTOTAGrrGC CTTAACAAGC 1200
AATCCICK?r CAAAATCTCA GCTATITCAA AATAATTATC CTCTTAACCT TCTCITCCCA 1260
OTGAACTTTA TOGAACATTT AATTTAGTAC AATTAAGTAT ATTATAAAAA TTGTAAAACT 1320
ACTACnrcr TrrAGlTAGA ACAAAQCTCA AAACTACrrT AGrrAACTTG GTCATCTGAT 1380
TTTATATTGC CTTATCCAAA GATOQQGAAA GrTAAGTCCTG ACCflGGTOTT CCCACATATG 1440
CdXHTACAG ATAACTACAT TAGGAATICA TICTrAGCrT CrrCATCTTT GTGrPGGATGT 1500
GTATACmA CGCATCrrrc CTTITCAGTA GAGAAATTAT GTGnOICATG TGGTCrrCTG 1560
AAAATOGAAC ACCAOTCm: AGAGCACACG TCTAGCCCTC AGCAAGACAG TTCOTTCTCC 1620
TCCrCCTTGC ATATrrcCTA CTCAAATACA GTGCTGriCTA TGATTGrrrTT TGTTTTGTTG 1680
rTTTTTYGAG ATCACGOTAC TGGGCTC 1707






(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 779 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:

CTOTCCCCAG TGTTTCCAGG TAATGACTTG GCACTCCAGA GAAAGTITCA TRCTGTTGCG 60
TOTGCTGGCT CCAAGCCAAG CACCTGGCAT GCAGGTCAGC CCTTCCCAGC GGGCGTGGCG 120
TOTrCCTCTr CACAGATGCC ACGrTTGCAGC CCCAAGGCCT CACCATTTTG OGrTTTTTTAG 180
AAACCCATTT TCITOGTCAT TTATAAAGCP GCTTTATAGA TATCmGAT CCTQGCATGC 240
crrGCjrrrcc Tcrcccrax: ctctttccaa TCCTGGTrrc CTAACCTCCT CrTGTAGTAA 300
TTCTCAACTC AACTCAAAGT CCCAAGAATT TQGAATGGTA GGATGCTGTG CGGGGAGCTC 360
GAGGCTGAGG CATAATCACT GCTTCGGnTC TGCTCATCAG GGGACACGCT CCCTTACTCA 420
TOGCAGCCAT GTTTGATTGT CACAGAGCCC CCCGAATACT CTGTCTATAG TGACACACTG 480
TAGGrCJTCAT AAATTTTAAG AAACCTGCTr TTAAGrTACTA TTTATAQC3TT TTTCrGTTAT 540
AdTCCAACC TACnTTTAAA ATACATGAGG ATTTTATGAA AGCTTTATAC AGACATTTAT 600
AGGAAACTCA TTCTTTCATT TTAGGTGCCA TTTAAATTGA TAACACTTAC TTTATAAAAA 660
GATGCTTTTT OTTCCGATAG AGCCPTATAG TTTAAAATAT CTTCATATAT TGCCATTTGA 720
TCAAATAAAT TTCTTACTTA GAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAACTCGA 779



(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1617 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

GGCACAGCTT TCICTCTCIT CCTCGCTCCC TCTCTTTCTC TCCTCCCTCT GCCTTCCCAG 60
TCCATAAAOT CTCTCTCGCT CCCGGAACTT GrTTGGCAATG CCTATTTTIT GGCTTTCCCC 120
CGCGTrCTCT AAACTAACTA TTTAAAGGTC TGCGGTCGCA AATGGTTTGA CTAAAOGTAG 180
GATOGGACTT AACTTTGAACG GCAGATATAT TPCACTGATC CTCGCGGTGC AAATAGCGTA 240
TCTGGTOCAG GCCGrTGAGAG CAGCQGGCAA GTGCGATGCG GTCTTCAAGG GCmTCQGA 300
cronrocTC aagctgggcg acacatggcc aactacccgc agcctgqgac gacaagacga 360
ACATCAAGAC CGICTGCACA TACPGGGAGG ATTTCCACAG CTGCACGGTC ACAGCCCTTA 420
CGGATTGCCA GGAAGGGGCG AAAGATATGT GGGATAAACT GAGAAAAGAA TCCAAAAACC 480
TCAACATCCA AGGCAGCTTA rTCGAACICT GCGGCAGCGG CAAOGGQGCG GCGGGGTCCC 540
TGCrcCCGGC CyrrCCCGGTC CTCCTGGTGT CTCTCTCGGC AGCTTTAGCG ACCTGGCTTr 600
CCTTCTGAGC GTOGGGCCAG CTCCCCOCGC GCGCCCACCC ACACTCACTC CATGCTCCCG 660
GAAATCGAGA GGAAGATCCA TEAGTTCTTT GGGGACGrTTG TGATTCTCTG TGATGCTGAA 720
AACACTCATA TAGGATTOtG GGAAAOCCTG ATTCTCTTTT TTATrPCGTT TGATTTCrTG 780
TCrnTATTT GCCAAATCTT ACCAATCAGT GAGCAAGCAA GCACAGCCAA AATCGGACCT 840
CAGCTTTACyr CCCTCTTCAC ACACAAATAA GAAAACGGCA AACCCACCCC ATTTTTTAAT 900
TTTATTATTA TTAArmTT TTGTTCGCAA AAGAATCTCA GGAACGGCCC TQGGCACCTA 960
CTATATTAAT CATGCTAOTA ACAOXSAAAAA TGATQGGCTC CTCCTAATAG GAAGGCGi^ 1020
AGAGGAGAAG GCCAGGGGAA TCAATTCAAG AGAGATGTCC ACGGACGAAA CATACGGTGA 1080
ATAATTCACG CTCACOTCGT TCTTCCACAG TATCrTGTTT TGATCATTTC CACTCCACAT 1140
TTCTCCTCAA GAAAAGCGAA AGGACAGACT GTrGGCTTTG TGTTTGGAGG ATAGGAGGGA 1200
GAGAGGGAAG GGGCTGAGGA AATCTCTGGG GTAAGAGTAA AGGCTTCCAG AAGACATGCT 1260
GCTATCCTCA CTCAQGGGTT AGCOTTATCT GCTGriTOrTG ATGCATCCGT CCAAGTTCAC 1320
TOCcnTA'rr rrcccTccrc ccTCTTGrrr TAGcrcrrTAC acacacagta atacctgaat 1380
ATCCAACGOT ATAGATCACA AGGGGGGGAT GrTAAATGTT AATCTAAAAT ATAGCTAAAA 1440
AAAGATTTTG ACATAAAAGA GCCTK3ATTT TAAAAAAAAA AGAGAGAGAG ATGTAATTTA 1500
AAAAOTITAT TATAAATTAA ATTCAGCAAA AAAAGATTTG CTACAAAGTA TAGAGAAGTA 1560
TAAAATAAAA CTTATTCTTT GAAAAAAAAA AAAAAAAAAW CTCGACCGCA AGGGAAT 1617






(2) INFORMATION FOR SEQ ID NO: 203:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1974 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:

GAAT-rCGGCA CGAGGCTGAG GGAGCTGCAG CGCAGCAGAG TATCTGACGG CGCCAGGTTG 60
CCTAGCTGCG GCACGAGGAG TTTTCCCGGC AGCGAGGAGG TCCTGAGCAG CATGGCCCGG 120
AQGAGCGCCT TCCCTGCCGC CGCGCTCTGG CTCTGGAGCA TCCTCCTGTG CCTGCTGGCA 180
CTCCGGGCGG AGGCCGGGCC GCCGCAGGAG GAGAGCCrGT ACCTATGGAT CGATGCTCAC 240
CAGGCAAGAG TACTCATAGG ATTTGAAGAA GATATCCTGA TTGTTTCAGA GGGGAAAATG 300
GCACCrTTTA CACATGATTT CAGAAAAGCG CAACAGAGAA TGCCAGCTAT TCCTGTCAAT 360
ATCCATTCCA TGAATTTTAC CTGGCAAGCT GCAQGGCAGG CAGAATACTT CTATGAATTC 420
CrorCCTTGC GCTCCCTCGA TAAAGGCATC ATGGCAGATC CAACCCnCAA TGrrcCCTCTG 480
CTGQGAACAG TGCCTCACAA GGCATCAGTT GTTCAAGTTG GTTTCCCATG TCTTGGAAAA 540
CAGGA3X3GGG TOGCAGCATT TGAAGTGGAT GTGATTGTTA TGAATTCTGA AGGCAACACC 600
ATTCTCCAAA CACCTCAAAA TGCTATCITC TTTAAAACAT GTCAACAAGC TGAGnPGCCCA 660
GGCGGC7IX3CC GAAATG6AGG CTTTTGTAAT GAAAGAOGCA TCTGCGAGTG TCCTGATGGG 720
TTCCACGGAC CTCACTOTGA GAAAGCCCTT TGTACCCCAC GATGTATGAA TGGTGGACTT 780
TCTC?IGACrC CTGGrrTTCTG CATCTQCCCA CCTGGATTCT ATGGAGTGAA CTGTGACAAA 840
GCAAACroCT CAACCACCrc CTTTAATOGA GGGACCTGTT TCTACCCTGG AAAATGTATT 900
TSCCCrcCAG GACTAGAGGG AGAGCAGriCT GAAATCAGCA AATGCCCACA ACCCTOTCGA 960
AATOGAQCTA AATCCATTGG TAAAAGCAAA TGTAAGTKTT CCAAA0C3TTA CCAGGGAGAC 1020
CrcTCTTCAA AGCCrcrrCTG CGAGCCTGGC TGrTQGTGCAC ATGGAACCTG CCATGAACCC 1080
AACAAATCCC AATCTCAAGA AQGTTGGCAT QGAAGACACT GCAATAAAAG GTACGAAGCC 1140
AGCCTCATAC ATGCCCTCAG C3CCAGCAGGC GCCCAGCTCA GGCAC3CACAC GCCTTCACTT 1200
AAAAAGGCCG AGGAGCX3GCG CSGATCCACCT GAATCCAATT ACATCTGC?rG AACTCCGACA 1260
TCTCAAACGT TTTAACTrrAC ACCAAGTTCA TAGCCTTICT TAACCTTTCA TCTOTTGAAT 1320
CTTCAAATAA TOTICATTAC ACTTAAGAAT ACTGGCCTGA ATTTrATTAG CTTCATTATA 1380
AATCACTCAG CTGATATTTA CTCTTCCTTr TAAGTTTTCT AAGTACGTCT GTAGCATGAT 1440
GGTATAGATT TTCTTOTTTC AGTOCTTTGG GACAGAITTT ATATTATGTC AATTGATCAG 1500
GTrAAAATTT TCACTTGriCTA GTTGGCAGAT ATTTTCAAAA TTACAATGCA TTTATGCrrcT 1560
CTCGGGGCAG GGGAACATCA GAAACSCrTTAA ATTQGGCAAA AATGCGTAAG TCACAAGAAT 1620
TTCC3ATGCTG CAOTTAATGT TGAAGTTACA GCATTTCAGA TrTTATTGTC AGATATTTAG 1680
ATCTnCTTA CATTTTTAAA AATTGCTCTT AATTTTEAAA CTCTCAATAC AATATATTTT 1740
GACXriTACCA TTATTCCAGA GATTCRGTAT TAAAAAAAAA AAAATTACAC TGTGGTAGTG 1800
GCATTTAAAC AATATAATAT ATTCTAAACA CAATGAAATA GGGAATATAA TCTATGAACT 1860
rrrrccATTC gcitcaagca atataatata ttctaaacaa aacacagctc ttacctaata 1920
AACATTTTAT ACTCTTTCTA TGTATAAAAT AAAGGrPGCTG CrTTAGTTTT CTGA 1974



(2) INFORMATION FOR SEQ ID NO: 204:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1057 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:
CGGCCTTCCG GGGCAACCC?r TCGTCCCAAC NCGGGAAAGG GTCCTGGAGN CGGGAACTAG 60
GAGCCICGGA ACJTCCAAGGG CGGAGCGCCC TTTGCTAATA AGCCAATCAG AACGTGAGAC 120
GCTCCGGIGG CaKXXntSCCG TCGAGCGCGG GGTGGAGTCT GGGTGACTTG GCTGGCGGGA 180
TCAAGTGCAG CIGCTTCAGG CTGAGGTQGC AGATAGTGAG CGCTGGTGGC GGAGrTAAAG 240
TYAAAGCAGG AGACTAA3MA TGAATAGCGC AGCGGGATTC TCACACCTAG ACCGTCGCGA 300
GCXSGCTTTCTC AAGTTAGGGG AGAGTTTCGA GAAGCAOXG CXXTTGCGCTT CCACACTGTG 360
CGCTATGACT TCAAACCTGC TTCTATTGAC ACTTCTTCTG AAGGATACCT TGAGKTTGGC 420
GAAGKTCAAC AGKTGACCAT WACTCTGCCM AATATAGAAA GTTGAAGGAA GCAGTAAAAT 480
TCAGTATCGT AAAGAACAAC AGCAACAACA ATGTOGAATT CASCTAGGAC TCCCAATCTT 540
GTAAAACATT CTCCATCTGA AGATAAGATG TOXCAQCAT CTCCAATAGA TGATATCGAA 600
AGAGAACTGA AGGCAGAAGC TAGTCTAATG GACX3tf3ATGA GTAGTTGrrGA TAGTTCATCA 660
GATTCCAAAA GTTCATCATC TTCAAGTAGT GAGGATAGTT CTAGTGACTC AGAAGATGAA 720
GATTGCAAAT CCTCTACTTC TGATACAGGG NAATTGTGTC TCAGGACATC CTACCATGAC 780
ACAGTACftGG ATTCCTGATA TAGATGCXaG TCATAATAGA TTTCGAGACA ACAGTGGCCT 840
TCTGATGAAT ACTTTAAGAA ATGATTTGCA GCTCAGTGAA TCAGGAAGTG ACAGTCATGA 900
CTCAAGAAAT ATTTAGCTAT AAATAAAAAT TTATACAGCA TGTATAATTT ATTTTGTATT 960
AACAATAAAA ATTCCTAAGA CTGAGGGAAA TATGTCTTAA CTTTTGATGA TAAAAGAAAT 1020
TAAATTTGAT TCAGAAAAAA AAAAAAAAAA AACTCGA 1057



(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 721 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 

GAATTCGGCA CGAOTCATCC CTCTCCCTCT TTCACTCCCT TACTCTTACT CTGTTTTTTG 60
TOCTCCAGAC AGACAGACCC TACCTdTTT GCTTCTTTTT TGTTrGTrTG TTTTGAGATG 120
GACTGTCGCT CTICTTCCCC AGGCTGGAGT GCAGTGGCGC AATCTCQGCT CACCACAACC 180
TCTCCCTCCC QGGTrCAAGC AATTCTCCTG CCTCAGCCTC CCGAGAAGCT GGGGATTACA 240
GGCATCCGCC ACCACACCCA GCTNAATTTT ATArTTTTAG TAGAGATGGT GTTTCTCCAT 300
CTPGGTCAGG CTOGCCICAA ACTCCCAACC TCAGGTGATN CCGCCTGCTT TGGCCTCCCC 360
AAACjroCTGG GATEACAGGC GTGAGCCACT GCGCCCAGCC TCTTTTGCTC CTTTATACTC 420
ATTAACICAC GCCTC?rAATC CCTGTTTTGG GAGGCCAAAG TGAGAAGGTT GCTTGAQGCC 480
AAGACTTIGA GACTAGCCTC GGCAACACAG CAAGATGCCA TCTTTATAAT AAAAATAAAA 540
ATAAAAATCA ATTAGCTGGG CATGGTQGAA CGCACCTGTA GTCCCAGCCA ATTGAGAC5GC 600
TX3AACTGGGA GGATCATTGA GCCCAGGAGT TGAGGTTGCA GTGAGCTATG ATCATGTCAC 660
TACACTCAGC CTOGGCAATA GftGC3GACATG TTCTCTCTAA AAAAAAAAAA AAAAAACTCG 720
A 721

(2) INFORMATION FOR SEQ ID NO: 206:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2465 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 

CCACCATTTA TCCAAdGAA GAGGACTTTAC AGGCAGOTCA GAAAATTCOT TCTATTACTG 

AACGTCCTTT AAAACTCCTT TO^GACAGTT TCTCTGAACA TG^^ AAGAACAAAG 

AC3C3GAGATCA TAAGAAAGAG C3GAGGTAAAG ACAGAGCTIT GAAAGGAGTT TTGCGAGTQG 

GAOTATTCGC AAAAGGATTA CTrCTCOGAG GRGATAGAAA TGTCAACCTT GTTTTGCTCr

GCTCAGAGAA ACCnCAAAG ACATTATTAA GCCGTASTCC AGAAAACCTA CCCAAACAGC 

TTCCTCTTAT AAGCCCTCAG AAGTATGACA TAAAATGTGC TGTATCTGAA GCGGCAATAA 

TTTTGAATTC ATCTCTGGAA CCCAAAATC3C AAGTCACTAT CACACTGACA TCTCCAATTA 

TTCGAGAAGA GAACATCAGG GAAGGAGATG TAACCTCGGG TATOGTGAAA GACCCACCGG 

ACGICTTOGA CAGGCAAAAA TOCCTTCAOG CrCTGGCTGC TCTACC3CCAC GCTAAGTGC?r

TCCAGGCTAG AGCEAATOCT CTCCAGrrcCT GTGTGATTAT CATACGCATT CTTCGAGACC 

TCnXnCAGCG AOTTCCAACr TOGTCTGATT TTCCAAGCTG GGCTATGGAG TTACTAGTAG 

AGAAAGCAAT CAGCAGIGCT TCTAGCCCTC AGAGCCCTGG GGATGCACTG AGAAGAGnTT 

TTCAATGCAT TICTrCAGGG ATTATTCTTA AAGC3TAGTCC TGGACTTCTG GATCCTTGTG 

AAAAGGKTCC CTITCATACC T3X3GCAACAA TGACIGACCA GCAGCGTGAA GACA3CACAT

CCAffrCC^ CTITOCAITC AGACTCCrrG CATTCOGCCA GATACACAAA GITCTAGGCA 

TOGATCCATT ACCGCAAATC AGCCAAOGTT TTAACATCCA CAACAACAGG AAAOGAAGAA 

GAGATAGTCA TOGACTTCAT QGATTTGAAG CTGAGGGGAA AAAAGACAAA AAAGATTATG 

ATAAOTTTA AAAACTTCrrCT CTAAATCTTC ACrTCTTAAAA AAACAGATGC CCATTTGnTC 

GCTOn™ AITCATAATA ATOICTACAT TGAAAAATTO ATCAAGAA3T TAAAGGATTT 1140

CAIOSAAGAA CCAAGOTTIT CTATCATAIT AAAAAATGTA CAGTOTTAGG TATTATTTGA 

ATOGAAAGAC ACCCAAAAAA AAAAATOTOC TCCGACTAGG GGGAAAACAG TAGTrCCGAT 

TrmCCCAT TAmTTATT TTATTTTCTG CTTTGCCCTAG CTTCCCCXXX: TATTTnCTG
TCmTATrA ACTAGTCCAT TOTCTTATTA AATCTTCACT GTATTTAATG CAGGATGTGT 
GCTTCACTrG CTCTCTGTAT TTTGATATTT TAATTTAGAG GrTTTTGriTTG CTTTTTGACA 
CTAC?ITCTAA OTTACTITCT TATAGATOTT ATCCTTTACC CCTTCTTAAT ATrTTACAGC 
ACTACGmr mCTAACGT GAGACTGCAG AGTrrGTTTT TCTATATGTC AAGGATTACA 
ACACAAAAAG TTATCCTCCC ATTCGAGrTGC TCAGAACTGA ATGTTTCTGC AGATCTTCTG 
GCATirorcT CTACTTOIGAT ATATAAAGCTT GTAATTAAGA CAGACmCTG TTAATCTAAT 
CAAGTITCCT C?rTA£?rTCrG CATTAGCAGT ATAAAAQCTA ATATATACTA TATGGTCTTG 
CAACACTTTT AAAGCCTCTG CATAATTGAT AATAAAAATG CATGACATTC TrGTTTTTAA 

TAGAcrrrrA aaatcataat tttaggttta acacgtagat crrrGTACAG ttgacttttt 

GACATAGCAA GGCCAAAAAT AACTTTCTGA ATATTnTTT CTTGTGTATA AGTGGAAAGG 
GCATTTTTCA CATATAAGTC GGCTAACCAA TATITPCAAA AGAACTTCAT CATTGTACAA 
CTAACAACAG TAACTAGCXX: TTAATTATGG TGACAGTTCC TTATTGGTCT GTCTGAGftTT 
ACTCTAGCAA CTATTACAGT ATAACACAGA TGATCTTCTC CACACACCCC ATCACCCAGA 
TAATTTACAG TrCTCTTAAC AGrTGAGGTTG ATAAAGTATT ACTGATAAAA AATTATCTAA 
GGAAAAAAAC AGAAAATTAT TTGGTCTQGC CATCTTACCT GCTTATOTCT CCTACACAAA 
GCTAAATATT CTAGCACyTGA TC?rAATGAAA AATTACATCT TACTGrTTGAT ATATGTATGC 
TCTGCyrACAC AGATOTCATT TKHTOTCAC AGCACTACAG TGAAATACAC AAAAAATGAA 
ATTCATATAA TCACTTAAAT GTATTATATG TTAGAATTCA CAACATAAAC TACmTGCT 
TTCAAATCAT C?rATCCTTCA GTAAAATCAT ATTCAAATTT AAAAAAAAAA AAAAAAAAAA 
CTCGA

(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1480 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 
GAATTCGGCA CGAGCTCAAG CTOGCAGGTG GTCGGGGGAG CGGCCGGAGA GGAGCTGCOG 60 
GGACTTTCOTG CXXTTOCAGGA CATOACACCA GTGGCATATC ACGGCCATGG GGTCTCAGCA 120 



60 TTCCGCTGCT GCTOGCCXXTT CXTCCTGCAG GCGAAAGCAA GAAGATGACA GGGACGGmT 
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GCTGGCTCAA CGAGAGCAGG AAGAAC3CCAT TCSCTCAGTTC CCATATGTGG AATTCACCGG 240 

GAGAGATAGC ATCACCTOTC TCAOGTCCCA GC3C3GACAGGC TACATTCCAA CAGAGCAAGT 300 

^ AAATGAGTTG GraSCTTTCA TCCXa^CACAG TGATCAGAGA TTGCXSCCCTC AGCXIAACTAA 360 

GCAAIATCTC CTCCTCTCCA TCCTGCTITG TCTCCTGGCA TCTGGTTTOG TGGTTTTCIT 420 

10 CCTCTrrCCG CAnCACTCC TTGTQGATGA TGACGGCATC AAAGTPGGTCA AAGTCACATT 480 

TAATAAGCAA GACTCCCTTC TAATTCTCAC CATCATGC5CC ACCXTGAAAA TCAOGAflCTC 540 
CAACTTCTAC ACGOTGGCAG TGACCAGCCT GrrCX:AGCCAG ATTCAGTACA TGAACACAGT 
GGTCAATTTT ACCGGGAAGG CCGAGATC5GG AGGACCGrTTT TCCTATGTGT ACTTCTTCTG 
CACQCTTACCT GAGATCCTGG TGCACAACAT AGTGATCTrC ATGCGAACTT CAGTGAAGAT 



15 



25 



35 



45 



50 



(2) INFORMATION FOR SEQ ID NO; 208 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGIH: 872 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEENESS : doiible 
55 (D) TOPOLOGY: linear 



60 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208: 
CACTATTTCC CrCAC?rACTC TAAGCAAAAG TCSGrrATGrTTT TPCTnCTTT ATGTCTACTC 



600 
660 
720 



20 TTCATACATT GGCCTCATGA CCCAGAGCTC.CTTCX3AGACA C^ 780 



840 
300 



AGGAAATTCC ACAGCTATTT AACAACTGCT ATTGGTTCTr CCACACAGCG CCTGTAGAAG 
AGAGCACAGC ATATCITCCC AAGGCCTGAG TTCTQGACCT ACCCCCACGT GGTGTAAGCA 

GAGGAGGAAT TOC5TTCACTT AACTCCCAGC AAACATCCTC CTGCCACTTA GGAGGAAACA 960 

CCrcCCTATC CTACCATTTA TOITrCTCAG AACCAGCAGA ATCAGTOCCT AGCCTGTGOC 1020 

30 CAGCAAATAG TTCGCACTCA ATAAAGATTT QCAGAATTTA ATACAGATCT TTTCAGCTGT 1080 

TCTTAGGGCA TTATAAAK3G AAATCATAAC GTGGrTTCTAG GrTTATCAAAC CATGGAGTGA 1140 

TCTTCGAGCTA GGATTGTCAG OXffiCCTGCAG GCCAITATCA GrTGCCTCATC TGTGCAGAAG 1200 

TCGCAGCAGA GAGGGACCAT CCAAATACCT AAGAGAAAAC AGACCTAGTC AGGATATGAA 1260 

TTrCTTTCAG CTOITCCCAA AGGCCTOGGA GCTTTTTGAA AAGAAAGAAA AAAGTGTGTr 1320 

40 UX - mTnT TTTITEAGAA AGTITAGAATT GTTTTTACCA AGAGTCTATG TGGGGdTGA 1380 

TTCACCCITC ATCCATTOGC TGGAACATGG ATTGGGGATT TGATAGAAAA ATAAACCCTG 1440 
CmrPGAITC AAAAAAAAAA AAAAAAWAAA AAAAACTCGA 



1480 
TG?rccrcTC?r GCSCCTTCTGG TCTACCXTTC TCTTCCTAGC CATTCAGTCT CrCTAGTCAC 120

CTCCCTAGrrA GCTACTTCCrc aCTAAGTTTT TATTTAATTA GAACAACTCC ATrrCCATrT 180 

CAAGCTAGGT CAATCGGGGG AAAAGCCTCA TGATTTAAAC TGAAGTTAAC AACACAGCTT 240 

TTAAAATGAA AACTCATACT CCAACTTCTA AAGTATATTT GAGCTGAITT GrTTCCAAAA 300 
CAAAGATATG CICTACCTAA AACTOCTAAA ACAAAAATAT AAAGACAAGG ACTAGGTGAT ' 360 

TAAGCX3GAGA GAAAAATCAT YTCTTTTCCA GGAAACCTTT GCTAAAATAA GCAAAACTTG 420 

AMrCTA2X3CT TCATCGAAAC TGACACAAAG AAAAGAAACT GATGGATTGC ACAGGCCTTG 480 



15 TTATAGAAAT AGATCTATAA AAAGATCTGT CCACAGGAAA TATACACCTT CTCCTGCTrrC 540 





600 
660 



TGAACTTCAA TCGGGATTTG TCACCTAGGT CTCCATCTAT AGGAATACCT TCACATACCT 
ATCTATTCAT GCACATATTC TGAAAACAGG TACATACAAA ATTACAACAA AGGAAAAAAA 

ITCTATTCAA CACTTAAAAA TAGAAACAGG CCAGGCACGG TGGCTCATOC TGTAATCCCA 720 

ACAATTTGCSG AGGCTCAGGC TGGTOGATCA OCTGAGGTCA GGAGrTGTGAG ACCAGCTTGG 780 

25 CCAACMX3Gr GAAACCCCCT CACTACTAAA AATACAAAAA AAATTAGCCT GTOTGCTTOGC 840 

ACACTCOTAC AATCCNGGCT GACTCGGGAA AN 872 

(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1779 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

AATTOCCAAG ACTQCACAAA ATTACAGTOC TAATGTATAT GGTTGCAGTT CACATAAAGA 60 

CAAAAGCATC TOTTATGAAA TGAGTACTAA TATTCSQGTGG TTGaOTTCTT CTTAGCAGAC 120 

TKX3CTTCAT WITOGTCTTG AGATAAAATG GCCAGCATAA ATGCTGTTTA TATTCACGTT 180 

TTCCTAGCTTG TGTOTCTGCA GGCCACAGCA GCATGCCCTT GGTGrrAGTCA GTGCCGAAAS 240 

50 GGcyrcTOTTC cttcttcagc ctgcctgcag ggatggtctc cttttaaagc AGGrrcrrGrrG 300 

CAGCATTCAG TACACTGAAG GTAAGCTAAA CCATCAACAT CTCTGGTGTT TTAAGATGTT 360 

ATrrrATrGG aacaactgac aaatcaggga TGrrrAGcrrT gtggcagaat tccctgcatg 420 

TGItSATAACT GATCTTOTTT TAnTTTTGG CATTCCAACT GTGGCATAGT TACAATTTCT 480 
ctitcktcat CACATTTAAA ATTCGRAGAG AACGCGCTTG AKGGATAGAG CGCCTTCAGK 540 
60 CTTACTCTm: TTATTAACTT TACTTTTrTT AAATCAACTT GCTATAGACT TTATATACAT 600 
TTICTTAAAT ATACTTCCTA GTGACATAGA AACGATGCGT AGTTTTCATT TACTAATTAC 660

AAATOTTGAG GCXrCAATTCT GAAAGTCCTC ATATTTAAAG GCTAGACAAC GTAATGAAAT 720 

TTTTAACTAT TTCTATGTCA TTITGAAAGT GTACTGCTTT ATGC3TAAAAG TGTTITrCAT 780 

TIOTTCATTG TTTTCATTAT TTCTGATCAT GITOTCTTTC AATACAGGCA TAAACCTTCC 840 

10 ACTCTTGAAC AAAGCAGCTC CnTTTAAAA GCGGTAATTG CTTCTrTACC TTTTATTTCT 900 

TTICTAAATG AAGCTTTTCT TTAAGAATOr GACTTTAAAG TCTTGTCTAT TGCATAAAAC 960 

AGITCACACr CACTTATTGT AAAGTGAAGA TTGTlCTACr GCATGTGAAG TGGACCATGC 1020 

AGArrrCTCT ATOTTCTCAG TA-roCATCAC TAGATAATAA AGTCTTTTGT GAACAAC3GCA 1080 

TircrAGCCA rrrrTAAAAG ' i - rm tJirrr cagtgctggt aagtcasgta aaccataaat ii40 

20 ACTTAAAAGC AACCTTrTCT TTTTTTCCTG AAAGTmTA ATTGAAAGTA TTATTAGrTTA 1200 

AAGATOTAAA CCTAGCCAAA ATTACCAGTT TATTAATAAT TAGGATCCTA ATTATTTCAA 1260 

AAAATCCTAC AAATATTCTC AGCTTTCAGT GTAGTGAGAT TATTCCTGTA GGTTATGQC3G 1320 

TATAATTCAG GATTTAACTA ATGrTTCTCC TATTTTCTCA CTTTTCCTTr TGATGC3TCOG 1380 

GAAflGAGAAA AAGGAAAACG GGGCACflGGC CATTCGACGC CTTCTCCAAG GGGTCTGATT 1440 

30 TGCTCAGACA CCAGCTICAC CTTCTrAACA AGGCACCTAA TTACAACAAG CATGCACATT 1500 

TTOCTGCArr CAAGAATC3GA AAATCAGAAT AGCAGCATTG ATTCTTCTCG TGCAGCTCAG 1560 

TOGAflGATGA TCACAACCAG AAGACATGAG CTAAGGGTAA GGGACTGrTPC TGAAGAACCT 1620 

TTCCATTTAG TCATCAAGAT ATC3GAAGCTG ATTTCTGAAA ATGCTCAGTC TGTACTCTAA 1680 

nATTTATOG TACCATTTGA AnCTAACTT GCATTTTAGC AGTOCATGTT TCTAATTGAC 1740 

TTACTGGGAA ACTGAATAAA ATATGCCTCT lATTATCAA 1779



(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2110 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:
55 GCGC3CCGCTC CAGCCCOGAG CTGAGCTAGC CGTCCGAGCC GAGCCGTCOG AGC00C3GGAA 60 
GCCGGCGCCST GCTGCCGCTC GTGGCGGCCA GAGGAGAGGA GAGGCAGCAG CATQGCGAGT 120 
GTCCTOTCCC GAOGCGTTGG AAAGCGGTCC CTCCTQGGAG CCCGGGTGrTT GGGACCCAGT 180



WO 98/54963



PCT/US98/11422 






GCCTCGGAGG GGCCTCGGCT GCXXXrAOCCT CXXy^GCCACT GCTAGAAGGG GCCGCTCCCX: 240 

ivsccrrrcAC cacctctcat grcaccccct GCCAGGAGCA GCCCAAGGAA GTCCTTAAGG 300 

CTCCCAGCAC CTCGCSGCCTT CAGCAGGTGG CXTTTTMAGCX: TGGGCAGAAG GTTTATGTGT 360 

GCTTACCSGCSGG TCAAGACTX3C ACAGGACTGG TOC3WGCAGCA CAGCTGGATG GAGC3GTCAGG 420 
TGACCX7ICTG GCTGCTGGAG CAGAAGCTGC AGOrCTQCTG CAGGGTGGAG GAGGTGTGQC 



GCCAAACKTT GGCAAAOTTC TGCGCTCCAT TGTGGGCATC AAACGACACG TCAAAGCCCT 
CCATCTGGGG GACACAGTGG ACTCTGKTCA GTrCAAGCGG GAGGACSGATT TCTACTACAC 



480 



TOGCftGAGCr GCAGGGCCCC TGTCCCCAGG CACCACCCCT GGAGCCOQGA GCTCAGGCCC 540 



600 



TQGCCTACAG GCCCGTCTCC AGGAACATCG ATGTrCCCAAA GAGGAAGTCG GACGCATGGA 
15 AATGGATGAG AIGATGGCGG CC3VrGGTGCT GACGrTCCCTG TCCTGCAGCC CTGTTGTACA 660 
GACTCCTCCC GGGACXIGAGG (XAACTTCTC TCCTTCCCGT GCGGCCTGCG ACCCATQGAA 720 
GGAGAC?rGG?r GACATCTCGG ACAGCX3GCAN CAGCACTACC AGCGGTCACT GGAGTGGGAG 780 
CflC?roC?It3TC TCCACCCCCT CGCCCCCCCA CCCCXrAGGCC AGCCCCAAGT ATrTGGGGGA 
TCCTTrroGT TCTCCXXIAAA COGATCATGG CrPTGAGACC GATCXTTGACC CTTTCCTCCT 



840 
900 



25 GGACGAACCA GCTCCACGAA AAAGAAAGAA CTCTGTGAAG GTGATGTACA AGTGCXnXTIG 960 



1020 
1080 



AGAGGTC3CAG CTCAAGGAGG AA3CTGCTCC TX3CTGCTGCT GCIGCTGCCG CAGACCCCXIA 1140 



1200 



GTCCCTOGGA CTCCCACCTC CGAGCCAGCT CCCACCCCCA GCATGACTGG CCTQCCTCTG 

35 TCTCCTCTTC CACCACCTCT GCACAAAGCC CAGTCCICCG GCCCAGAACA TCCTGGCCCG 1260 

GACTCCrcCC TGCCCTCAGG GGCTCTCAGC AAGTCAGCTC CTGGGTCCIT CTGGCACATT 1320 

CAGGCAGATC ATCCATACCA QGCTCTGCCA TCCTTCCAGA TCCCAGTrCTC ACCACACATC 1380 

TACACCAGTG TCAGCTOGGC TOCTGCXXrC TCCGCCGCCT GCTCICMrC TCCGGTCCGG 1440 

A0CCX3CTCGC TAAGCTTCAG CXMGCCCCA GCAGCCAGCA CCTGCGATGA AATCTCATCT 1500 

45 GATCGTCACT TCTCCACCCC GGGCCCAGAG TC3GTGCCAGG AAAGCCX3GAG GGGAGGCTAA 1560 

GAA(7IX3CCX3C AAGTOTATGG CATCGAGCAC CGGGACCAGT GGTGCAOGGC CTGCCGGTOG 1620 

AAGAAGGCCT GOCAGCGCTT TCKX3?W=TCA GCTGTGCTGC AGGTTCTACT CIOTPCCTGG 1680 

CCCTQCCGGC AGCCACTCAC AAGAGGCCAG TGTGrTCACCA GCCCTCAQCA GAAACCGAAA 1740 

GAGAAAGAAC GGAAACACGG AGrTTTGGGCT CTGTrOQCTA AGGTOTAACA CTTAAAGCAA 1800 

55 TmcrcccA TTCTCCGAAC ATTTTAITTT TTAAAAAAAA GAAACAAAAA TATTTTTCCC 1860 

CCTAAAATAG GAGAGAGCCA AAACTGACCA AGGCTATTCA GCAGrGAACC AGriGACCAAA 1920 

GAATTAATTA CCCTCCGrXT CXXZACATCCC CACTCTCTAG GGGATTAQCT TGTGCGTGTC 1980 




AAAAGAAGGA ACAGCTCCTT dQCTTCCrG CTCAGTCGGT GAATTCTTTC CTTTCTAAAC 2040 
TCTTCCAGAA AGGACTCTGA GCAAGATGAA TTTACirTTC TTAAAAAAAA AAAAAAAAAA 2100 
5 AAAAACTCGA 

(2) INFORMATION FOR SEQ ID NO: 211:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 938 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:
20 GGCACAGGAA AAAAAAGAAA AAAGAAAAAA GAAAAAAGTT TTTCTACCCA CAGATTAGCA 60 
TrrrCTTCAT OTITCAAAAA AGTITAAGCT ATGTCCTAAT TTAAAAATGA GCACAAACTA 120 
CTTAACAGAT CTCTOTTCCC TCTTCTCTTA CTTAAATTAT CnTATITrC ACCATCACCT 180 
CCCACTTOCCG AACACCTCAN CTCTCTGTTT TGTGGTIQGA TCCTGGC3TTG CCAAGTTCCT 240 
ATTTGCTICAG TCCCTGGCCT CTXSGGGCGGT CTCAGGAAGT GGCATGCTCT TCAJGRAGGA 300 
30 TCCTTCAmrr CCAGTATAAC CAWrmCTTA ATAATAGTTG ATAATTCCCA GCTTTTACCA 360 
GATCARTITr GACTTATTOr TCCTCCTTrG ACCrGTTCAA AGCTAACATA TCTOQGTCAG 420 
TTCGGAGAGG GTOGGCSGATT TGAGAATGTG AGGAGGAGTG GGGTTAC3AT GGGnTTGCCT 480 
ATCTOGGCAA GGAAAGAGTTT CCTAGTCGAT TGGC3CACAAT GACAAAATGA TTCCATC5GAT 540 
AGAATCC?rCC CATOTTGCTC GAACACCTCA aSTGTTGTGA ACGCCTTAAA TTCCTGCCAT 
40 CCCnCTCTG A3TCCCCACC TCCCTGTAGT TTCCACAGGA TTTATCTCTC TGTACCCCCG 

TCCrcCAACT CTACTCTCTC AGCCICTCCT CCATCCCTTA CTTCCCTTCT AAATTCCAGG 720 
AGATOACCTC ACTTTCCAAA GCAAATIGGA GCCACCAAAT TGTAGCTCTC CTCGGTC3GAA 
ACTGCATCTC TCCTCATCCC TOZACCTTCT TC3CAGAAAGC CGCCCCCTCA GGCCAAGATG 
AOTGCCreGC CCCCATC3GGA GACTCAGACA CTTTGACCCC TTCTGACnC AGCATCTCCC 

TCTTTAAAGA TTCTCTCCCA ACATTCAGTC CTTGCTCGA 938



600 
660 



780 
840 
900 



(2) INFORMATION FOR SEQ ID NO: 212:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1551 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:
AGGCTGGACT AAGCATAGAG AACCAGGAGA GAAAGAAAGA TTTAAGAGAC Ta^^TTAATAT 
TTTITCACAG ATCATTTAAG AAACTGAC3TA ATITrTrrTT TCTCCAAAAG .GGC^TGGGTT 

rrTnTTTCT rriuiT rnT crcTATTTGG cactttctag ggattggtct ATAAAirnr 

TGAAAGATCA TAGGATAAAT TTCTTTGTAG CAACTTCCTA TnTAGTGnT TATGITAGGG 
GARCCCCARG TOTCCCTCCT GATACGCCAT TAGGGCCACT TCTCAGCCTC TGGCTACATC 
ATAATOCm TTTTTCTATC TTGCCAAAGT TTCCM3AAAA TTKAKGTTrr CTAATTITAA 
AAAAKlTOCyr TCIGGAGATC GGATGGGACC TCTITATAAG CCCTGAAAAT AAGriGATmi 
TrrTAAGTCC TATTCTOCTA TAAACCTGAT TCTCACTTTT TTCTGTAGAC AAC^GrmT 
TATAATATAT CTAimCTG TCGACATTAT TTCCrTTTAA CCAATACTGA AATTCCATAG 
TCTAWACTIT CTCCACATTT TCrTTGATTA ATACTTYCTT AAAATAGACA CTTGGATTGG 
CACCAGCTOT CACCAATAAA GCPGCCCTGA ACATTGTCAA TCAATCCTGT TAACCAATTT 
GAGAATTrrr CTOGAA'TOCT TAGriTAGGGA TGAAATTCCT GGGITATAGG TA'KSGTATG 
CTIGATATAC TmCTCCAG AATGTCTACA CCTOTGTCTA CACCACATCT CCAGAGATAG 
GGGAATCrrA TCTCCCTCCT AACTGCTCTC GrTTATTTAAT TTTCTGACAT TTGCCGCCGC 
CX3CCGCCCCC TGOXCCAAC ACACACATGG TATAAAGTGG TAGrTTTCTTG TTTTAAATTG 
AACTTITCAA TGATTTCAAT TTGGGCATTT CTTTGTATCC TGAGTTATTT TQGTTTCCCG 

TEATCravAT ATCcrrrrcc tatgctttaa ctacttttct AATTrcrrccc ttttttnggt 

TATCAAATTC CfiGGCCATTG TCTATTCCAT CGTCACrTTT GGGTATTGGA AACATCITTC 
CATTCTOTAG CCTGrrCTGTT GAACATAAAT CTTGATTTTT ATGTAATCAG AnTTTCTCC 
aTACGCTTAT GTTCTTCGAA TrrTATTTAA GAAATCTTET TCTATCCTGA GACCACAAAA 
ATCTCCCCAC CATTnCTTC TCnTTCATAG TTTTGCCTTG TATGTTTAAT CCTTTAAGGC 
ATCyiCTACJIT CATTTrATAT GG?rGTGAAAT AGTTCTTATT CATTTATTCA ACACATATTG 
GIGGACTGCC TQCTCATCGT AGTACrCTTC AGAGTACTTT GTATATAITr GTCAACACAT 
AnCTTGCCC TOGAAGCTTA TGTTGTCNTr CAAGGTAGAT COnACTCGG TTTCCACCTG 
TTTTCrTCAG CCCTCAGGAT GAATTCCACA AaTTTACACA TAGCACCAGT TAAGGAATAG 
GCrrTATTQG AGAAAAGGAA GGCTTATTAG ACCAGCATCA GCAAAAAAAA A 



50 
(2) INFORMATION FOR SEQ ID NO: 213:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1495 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:

GAATTCGGCA CGAGK3ACCA CAGATATCTT TGCSCTTTCP-iS CCTCPCG^r;^. ATGCICTCCA 
CTATOTTTTT TTTAATOGAT TGACATCTCA TGAATCCACA AATTT^AGCCG CTTTTCCVTC 



50 
120 
130 
240 
300 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 997 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:
10 AGAGACTCCT CAACAGAACC TAATCATGCT GGCACCCTAA CCTC-.TP-rr? CTAGCC?CCA 
GAACTGAGAG AACATAAACT CCAGTTGTTT AAGCZACCGV C^CT^^rGGCA -TTGrTTATTA 
TAGCCCAAGC TAAGTCAGGT GGAAAGGC?a3 AAATATTTTO ASA-^-^-'rC.-. rrTCTAC-AA 
AACAGACTTG TTCTAAATGA AATQGCCAGA TATT7CATC? Zm^-ST^ST^ ==GTA1TIArG 
AAACTTTCAT TAAACACCAC TTGGCCAGCA CCCAGGCC^^J CCACCCTC^i^ .=ACGGCA?^r 
20 AAAAGCAAAT GATTTGAGGA ACAAAAGACT GGAC^yCAGAG CCTCrc;yS?iA GATGGCICCA 360 
TCTTCTGAGA TCATCTTCTG AGATCATCAA TTTTCTGC?!: Cia-ZtnCCT .iCTCCAiCTG 420 
TACTTAGATAA GAGCAAAGAC ACTTCCTGAT CCTGrTGGAAA ^OOCZr^^ CCTGCTa-.rG 
GAGAGGCTCA CACTGGGACC AACAGAAGGC CGG^XZATTTA irTTC-nGC^ ZCCTTCIGCA 
CCTGGGCCCT CTTCAGGCCT TGTACCITGC ACTCCCCATG CC-^G^^ ACCTGGZSJiiS 
30 CTCAAGTTAG GTATTTGAAG AGATAATrTG CCCCCAACAA AG^^^ACTT AAAAGAA.^--A 
GGAAACCACT AAATTCCACT TGACAAACCA GITTGTTa-^ ^XSCAAATTTG 
AAACTTTCTC TTTGGCACCA TATGATTCTG TTACAITAGG GrTC^-rCAAT GCTAAGATAC 
ACAGCTAGGT CTACCAGCTG CCAGTGGTCA AaAA-TSAAAG A.=-rcrCTCA3 .^y3AG?ISArCA 
GriTTCTAATA ACCTAACAGT TTTdnTGGS TATTACMA.-A AAAPAAA?:.---. TTAGAAT^-AA 
40 ATGTCAGTGC CATGCAGGCA AGTACAGATA TGGAAATGAA AGCtCrGTCT ^^lAACrGCAA 



430 
540 
500 
550 
720 
840
960



GATTTGrrTTG TTAATAAAAT TGATTGGGAT CACTCGA 997



(2) INFORMATION FOR SEQ ID NO: 214: 
TTTTCCATCT TTCTCATAGC TTCATCACGC ACGATC5GAGG TCACTTCAGC ACTATCOGGA



180 



540 
600 



GCGGCCTCAC GGACAGATCR GTCAATTTCC rmiA: m T TCTTGATGTA OCGGATTGTC 240 
GACTCOTTAA CAITCAGCTC AlOSCCAACA GCACTCTAAC TCATGCXTTGA TTC5GAGCTTA 300 
TCCAACACGC GGAMTTTCTC CGTTAAGGSAM ATCAMCSGTCT TCTTTCGCTT AGGAACACTG 360 
GGCARAROTT AARCACTACG CITOGGGGCC ATTTTAGAAA GCAAAACCAC CCACAAAAAG - 420 
CAGAAAAAAA ACTTCTCAGrTA AACAGACTOJ NGANAGGACT CTTTGTrTAC AGCACAGGAG 480 
CrcCGACTAG AAGQCGQCXX: TTCTCCCCAG TTCAAACTTC AGCTGGGAAC CTTACCTCCX3 
15 CCAACrcCAA ATTTTCACCC TCTGCGCATG CCCGGGAAAS AAACCCCCAG AACAGTACOG 

TCATCATTGA TITrAGGG?rT ACAAATACAT TTTAGCAACT AAGTGAATTT GGCATTACGA 660 
ATTAATGATT AATGAAGGTC ACCTGTATTT CCATAGATAT GrrAAnTTAT TTAAGCAGCyr 720 
TrATTATATT AAGGCGGSGA GGCAGCGCCG AAGACTACAA GrTTCCAGCAT GCACOGCGTC .780 
CX3GGCGGCTT CQGCSCTCCCA GCGAGCXSCTT CAGGGAOGCC AGCCXXX^GG CATCGGCCGG 
25 AftroSTCOTA GGC3CAACCAC CyEAGTACTCT CTGCGCATCT GCAAAGCX3CT GTCQGGGGCC 
OXCTAGCTG CCGTCGCCGC CGCCGGGGCT CTATGGTCTC TCCCTAGAGC TTTGCCGrTTG 
GAGGCGXXrre CTGCGGTCTT GTGAGTTTGA CCAGCGTCGA GCGGCAGCAA CATQGAGGAA 

rrCGACTCCG AAGACTTCTC TACGTCGGAG GAGGACGAGG ACTACGTGCC GTCGGGTCAG 
CGATTCCGCC TGAGGCGAGA AGCGAATTGC CCCGCCCXIAC GCCTCACGTG AGGCGCGCTC 1140 
35 TGCCCCCGCG GGGOTCIGCC CTGTGGCCX:A GGTQGTCCAG GGGGGCTCCT GTTCTCGAGC 1200 
OTCCGCICCC TCAGGCCCCT CATCCTO3GC CGCTCCGGCC CGAGGCGTGT GCGCGTGGOG 1260 
CTTCTOIGCT CXXXrrCCCGT TOGGCAGCTC CGGCX:GCCGC CCCCTCTTGC AGCGCGQGAA 1320 

CGGCACATQG ACAOSGCX^CC TTGTCGCTAG GGAOSCTCGT CGGTCAGCCC CGAACGACAA 1380 
CGCTGCTTCA GAAG3XXXX3G CGGCAGTTCG AGCCTTGGAA GrmTTTTCA GCCXTOGCXX: 1440 
45 GAGAGAGCTG CTOGCCAACA ACXXXTTCCAA. GATAGAGCTG TCCQJTCTCC CSKCTGG 1496 



840 
900 
960 
1020 
1080 



(2) INFORMATION FOR SEQ ID NO: 215:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1308 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:



60 TTCGCANCNG GGAGAGGGAA AGAGGAGGAA ATQGGGTITG AGGACCATGG cttacct 
(2) INFORMATION FOR SEQ ID NO: 216:



CTGCCnTCA CCXJVTCACAC CCCATTTCCT CCTCTTTCCC TCTCCCX33CT GCCAAAAAAA 120 

AAAAAAAAGG AAACGTITAT CKTGAATCAA CAGGGTTTCA GrTCCTTATCA AAGAGAGATG 180 

^ TGGAAAGAGC TAAAGAAACC ACCCTTrGTT CCCAACTCCA CTTTACCXM' ATTrTATCCA 240 

ACACAAACAC TCTCCTTTTC GGTCCCTTTC TTACAGATGG ACCTCTTGAG AAGAATTATC 300 

10 OTATTCCACG mTTAGCCC TCAGGTTACC AAGATAAATA TATGTATATA TAACCTTTAT 360 

TATTCCTATA TCTTrGTCGA TAATACATTC AQGTGGrrGCT GGGTGATTTA TTATAATCTG 420 

AACCTAGGTA TATCCnTGG TCTTCCACAG TCATCTTTGAG GTGQGCTCCC TGCTEATOGTA 480 

AAAAGCCAGG TATAATGTAA CTTCACCCCA GCCTlTOrAC TAAGCTCTTG ATAGTCGATA 540 
TACTCTITTA ACTITAQCCC CAAIATAGGG TAATQGAAAT TTCCTGCCCT CTQGGTTCCC 

20 CATTTTTACT ATTAAGAAGA CCAGrTGATAA nTAATAATC CCACCAACTC TGGCrTAGTT 
AflCJIGAGACTr GTCAACroTG TGGCAAGAGA GCCTCACACC TCACrAGGrTC CAGAGAGCXX: 
AGGCCITATG TTAAAATCAT QCACTTGAAA AGCAAACCET AATCTGCAAA GACflGCAQCA 



600 
660 
720 
780 



AGCATTATAC GGTCATCITC AAIXSATCCCT TTGAAATTTT TTTTITCm GrmCTrTAA 840 



900 



ATCAAGCCIG AGGCTGGPTCA ACAGrTAGCTA CACACCCATA TTGlX?rGrrrC TGrTGAAOXSCT 

30 AGCTCTCrrG AATTTCGATA TrGGPTATTT TTTATAGftGT GTAAACCAAG TTTTATATTC 960 

IXXrAATCCGA ACAGGTACCT ATCTCTTTCT AAATAAAACT GTTTACArrC ATTATGOGGT 1020 

ATCTATGACC TTCATTTTCC AAGAAATAGA ACTCTAGCTT AGAATTATGG ATGCTCTAAA 1080 

ATOTCAGAAT GGGAACTCTC CTCGAACTTTC TCCCAAACTC AGAGACAGCA CTGCCTTCTC 1140 

OTAAATGATT ATTCTTTTCT CXXTCTTITC TGGrrATTrTC TAGGCATCCT TCTCACXIACA 1200 

40 GCXATAACCC TTTTTTACTT CCATTAGGCC GTATAACTGG NGGGACM3CT QGTCGGTATA 1260 

TAATACTOCyP WCCAACAMAG GGGTTCTGGA TGTACACMAG GTTATCTT 1308 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1705 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

TOGCCAOXSGA AGCGCTAGAA GGTTTAGATT TTGAAACAGC AAAGAAGGAT TTCCTTOGAT 60 

CTOGAGACCC CAAAGAAACA AAGATGCTAA TCACCAAACA GQCTGACTGG GCCAGAAATA 120 

TCAAGGAGCC CAAAGCCGCC GTCGAGATOT ACATCTCAGC AGGAGAGCAC GTCAAGGCCA



AAAAAAAAAA AAAAAAAAAA AAANA 

(2) INFORMATION FOR SEQ ID NO: 217: 
(i) SEQUENCE CHARACTERISTICS:



180 



TCGAGATCTC TCGTCa^T GGCTGGOrTG ACATGTTGAT CGACATCGCC CGCAAACTGG 240 



600 
660 



ACAAGGCTCA GCGCGAGCCC CroCTGCTGT GCGCTACCTA CCTCAAGAAG CTGGACAGCC 300 

CTCGCTATC5C TGCTCAGACC TACCTCAAGA TGGGTGACCT CAAGTCCCIG GTGCAGCTQC 360 
AGTOGAGACC CAGCGCTOGG ATCAGGCCTT TGCrTTGGGT GAGAAGCATC CTGAGTTTAA . 420 

GGATCACATC TACATGCCCT ATOCTCAGTG GCTAGCAGAG AACGATCGCT TTGAGGAAGC 480 

CCAGAAAGCG TTCCACAAGG CTCGGCGACA GAGAGAAGCG GrrCCAGGTGC TGGAGCAGCT 540 
15 CACAAACAAT GCaJTOGCQG AGAGCAQGOT TAATGATGCT OCCTATEATT ACTGGATGCT 
fflCCATOCAG TGCCTCGATA TAGCTCAAGA TCCTGCCCAG AAGGACACAA TGCTTGGCAA 

CTTCTACCAC TECCAGCGTT TOGCAGAGCT GTACCATGGT TACCATGCCA TCCATCGCCA 720 

CACGGAAGAT COCnTCAGTG TCCATCGTCC TGAAACTCTT TTCAACATCT CCAGGTTCCT 780 
GCTCCACAGC CTGCCCAAGG ACACCCCCTC QGGCATCTCT AAAGTGAAAA TACTCTTCAC 
25 CTTOGCCAAG CAGAGCAAGG CCCTCGGrTGC CTACAGGCTG GCCCGGCACG CCTATGACAA 

GCrcCOTCGC CTCyrACATCC CTGCCAGATT CCAAAAGTCC ATTGAGCTGG GTACCCTGAC 960 

CATCCGCGCC AAGCCCTTCC ACGACAGTTGA GGAGTTGGTC CCCTTGTGCT ACCGCTGCTC 1020 

CACCAACAAC CCGCPGCTCA ACAACCTGGG CAAtelCTGC ATCAACTGCC GCCAGCCCTT 1080 

CATdTCTCC GCCTCrrCCT ACGACGTGCT ACACCTGGTT GAGTTCTACC TGGAGGAAGG 1140 



840 
900 



35 GATCACIGAT GAAGAAGCCA TCTCCCTCAT CGACCTGGAG GTGCTGAGAC CCAAGCGGGA 1200 
TGACAGACAG CTAGAGATTT GCAAACAACA GCTCCCAGAT TCTTGCGGCT AGTQGGAGAC 1260 
CAAGGGACrc CMCQGAGAT NAGGACCCGT TCACAGCTAA GCrRAGCTTT GAGCAAGGTG 1320 

GCTCARAOTT CGIGCCAGTG GTGGTCAGCC GGCTQGTGCT GCGCTCCATG AGCCGCCGGG 
AlCTCCTCAT CAAGCGATGG CCCCCACCCC TGAGGTGGCA ATACTTCCGC TCACTGCTGC 
45 CTGACGCCrc CAITACCATC 1GCCCCTCCT GCTTCCAGAT GrTTCCAlTCT GAGGACTATG 

AOTTCCTOCT GCTTCAGCAT GGCTOCTGCC CCTACTGCCG CAGGTGCAAG GATGACCCTG 1560 
GCCCATGACC AGCATCCTOG GGACGGCCTG CACCCTCTGC CCGCCTTGGG GTCTGCTGGG 1620 
CTOTGAAGGA GAATAAAGAG TTAAACTGTC AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1680 



1380 
1500
(A) LENGTH: 999 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 
AGCAAATCAC CITAACGATC TGGAATGAAA CTGTGACCAG TGCCGCCCTG GGTGGTTCTG 60 
10 GAGAGACTGC CGTCTTCTrG TTTGGCCATA QGTCCTGGGG CCCCGGCTTC AGTCACTGTC 120 
TCAGACAGKA OTCCCGATAA GCAGATCACC AGTCCTOCAC TGTCCITCCT GTCC3GCCTTG 180 
CTCCATGAGA AGATAGCTCC TTCCTCCCTC TTTTCCTACA CTGTAAATTA TTC3TTTTACA 240 
AITCACyrcVC TTAATAATAG TYTACAAATA CTATGTATTT ATGCAAAACT GTTAAAGrTTC 300 
TCATCroiTA TGATTC5GATA CTTGGTCTTG TCAGTAGTCG TCAGCAITGG GTTGTGAGCT 360 
20 TOTTCTACTC CATACCTCHT TA3XXTGCTA TGCATTTTAC ATTOrGTGTT CACATCTATT 420 
CCAAGGAGCC TTCCTAGAAA CAACACTGGC GGTTCCTGCA GGCCAGGCAG GCATTGGCCC 






480 



ATCCICTGTC CCATAGGAGC CAATCGAAAG AACGTAGCTT GGTCTGCTAG CCAGCCGTrGG 540 



600 



GCnXXiCGCAG GCCAGGCAGC CTCTGCACCA GAGTCCAGCA CCTGCCCATT CCCCAGTPCAC 

ACAATCATAC TCITCTrrCA TAGAGATTTT ATTACCACCT AGACCflCCCT AGTTTTCCTC 660 

30 TCrCTTAGTO TCCTCAGCTC TTITCCAACA AAATC3TAGGT ACAGACCAAT CCCTCTCCCT 720 

TCCCCAATCA GGAGCTCCAC AOCATGAGTT GmTCGTTTT CCAGAAGCTG CCAGTGGGTT 780 

CCCCHGAATT GCGTTAAGAT ATCGATCATK TTTTTTATTG TTTTrCTTCT TGTTnTrTA 840 

AATAATATAT TTAAAGGCAG TATCTITICT ACTGTCAATT TGCAGTAGAA GATGCAGAAT 900 

QCAcrrrrrT TrrAcrrcTG ttogtgtgta ttgtatatag tgtgtgtgct TcrrcrrGATG 960 

40 AAAATAAACT TnTCTTTAT AAAAAAAAAA AAAAAAAAC 999 



(2) INFORMATION FOR SEQ ID NO: 218:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 941 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:
GGCACGAOTA GCAITrCATT TAATCTGCAG GTATATTCTC CCAACAGmT AaTOTCATGT 60
GATCTCCTCA GCCAAGATTC TRAQGCAGAG AOGAGCTGTC CCAACCTACT ATACCACCGA 120



GGCTCGAGAG ATCATATTTT TC3GTATTAAA CTGGAGTCTC TCCATCCTTC ACATTGTTGA 180 






TGKxrrcrCT agcaaaccgg AAAAGTCAGT GACAGAAGAT GCCGCTAGCG GTTTGAGCCA 240 

GAGAATCACA GCTCTGGTTT C3GAGAAAAGG GCCGGATGGT C3GCTCTAGAA AGCCCATCCT 300 

TCTCcrcrrc Trrrrrcrcc oxttatatt GrixxrrTTCAT tcattcattc attcatcaaa 360 

CAnronGA GCACCTATTA TGTCTCAAGC TCTGTCCTAG CCTCTGGAAA ACCTGCCCTC 420 
ATOTAGCrCA CTCTOGACTTA GGAGAAACAA TGACTACACT ATGATAAGCA CGGGTTGTCA - 480 

GGGTCTCACA GAGCAOTOGC CCCTCATCCA GACCGATGAG GTCAAAGAAG GCATCCAGGC 540 
GAGGATQGTG TCAGAGCTAA CTGAAGAATG AGAGGGAGCT GCACCASCAG QGCnTGGAAC 
15 TGAAGCTGGC AGTCCCIGGA GTCTTGATTC CAGCAGAGGG AGAGCAGTCT GTGAAAAGGC 






GCSGCAAAGCT AGAGAQGTAA GAAGAATCTA CAAATGTTCC TCGAGTTACA TGAACTTCCA 
TCCCAATAAA CCCATTGGAA AOGAAAAATT TAAGTCAGAA GTGCATTTAA GGCTGGTCCG 
ACTAGAATCA TTrTTACAAC GAATTGATCA CAACCAGTTA CAGATGTCTT TOTTCCITCT 



600 
660 



ACCAAGGC?rG GGAGAGGGCA GAGCACATGG AGGAACTTCA GGTAGTICTG GATGGCSCTG 720 



780 
840 
900 



25 CCACTCCCAC TGCTTCACCT GACTAGCXOT TAAAAAAAAA A 941 



(2) INFORMATION FOR SEQ ID NO: 219:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 575 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:
40 TAAC?IX3GAAT CCCCCXSGGGT TGCAGGGAAT TOGGCAOGAG GCATTCTGAG AAGCTTAAGA 60 
CATACTTIGA AGACAACCCT AGGGACCTCC AGCTGCTGCG GCATGACCTA CCTTTGCACC 120 
CCGCAGTCGT GAAGCCCCAC CTGGGCCATG TTCCPGACTA CCTGGTTCCT CCTGCTCTCC 180



GIGGCCTOCT RCGCCCTCAC AAGAAGCGGA AGAAGCTGTC TTCCTCTTGT AGGAAGGCCA 240 



AGAGflGCAAA GTCCCAGAAC CCACTGCGCA GCTTCAAGCA CAAAGGAAAG AAATTCAGAC 300
50 CCACAGCCAA GCCCTCCTCA GCSnTOTTGGG CCTCTCTGGA GCTGAGCACA TTGTGGAGCA 360 
CAGGCTTACA CCCrrCC7K3G ACAGGOGAGG CTCTGGTGCT TACTGCACAG CCTGAACAGA 420 
CA£?nCK3GG GCCQGCAGTO CTCGGCCCTT TAGCTCCTTG GCACTTCCAA GCTGGCATCT 480 

TGCCCCTTCA CAACAGAATA AAAATTTTAG CTGCCCCAAA AAAAAAAAAA AAAAAAAAAA 540 
CICGAGGGGG GGCCCGTACC CAATTCGCCC TATAA 575 

(2) INFORMATION FOR SEQ ID NO: 220:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3018 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:

GCCAGCCTTA CAGGTTTTAC GTGAAATGAA AGCCATTGGA ATAGAACCCT CGCTTGCAAC 60 

15 ATATCACCAT ATTATTCGCC TGTTTGATCA ACCTGGAGAC CCTTTAAAGA GATCATCdT 120 

CATCATTTAT GATATAATGA ATGAATTAAT GGGAAAGAGA TTTTCTCCAA AGGACCOGGA 180 

TGATGATAAG TTTTTTCAGT CAGCCATGAG CATATC5CTCA TCTCTCAGAG ATCTAGAACT 240 

TGCCTACCAA GTACATGGCC TTTTAAAAAC OGGAGACAAC TGGAAATTCA TTGGACCTGA 300 

TCAACATCGT AATTTCTATT ATTCCAAGTT CTTOGATTTG ATTTGTCTAA TGGAACAAAT 360 

25 TGATGrrTACC TTGAAGTGGT ATGAGGACCT GATACCTTCA GCCTACTTTC CCCACTOCCA 420 

AACAATGATA CATCTTCTCC AAGCATTGGA TGTGGCCAAT CGGCTAGAAG TGATTCCTAA 480 

AATTTC3GGAA AGATAGTAAA GAATATGGTC ATACTTTCCG CAGTIGACCTG AGAGAAGAGA 540 

TCCTGATGCT CATGGCAAGG GACAAGCACC CACCAGAGCT TCAGGTGGCA TTTGCTGACT 600 

GTGCTGCTGA TATCAAATCT GCGTATGAAA GCCAACCCAT CAGACAGACT GCTCAGGATT 660 

35 GGCCAGCCAC CTCTCTCAAC TGTATAGCTA TCCTCTTTTT AAGGGCTGGG AGAACTCAGG 720 

AAGCCTQGAA AATGTTOGGG CTTTTCAGGA AGCATAATAA GATTCCTAGA AGTGAGTTGC 780 

TGAATGAGCT TATGGACAGT GCAAAAC?rGT CTAACAGCCC TTCCCAGQCC ATTGAAGTAG 840 

TAGAGCTGGC AAGTCCCTTC AGCTTAOCTA TTrGTGRGQG CCTCACCCAG AGAGTAATGA 900 

GTGATTTTGC AATCAACCAG GAACAAAAGG AAGCCCTAAG TAATCTAACT GCATTGACCA 960 

45 GTGACAGTGA TACTGACAGC AGCAGTGACA GCGACAGTGA CACCAGTGAA GGCAAATGAA 1020 

AGTGGAGATT CAGGAGCAGC AATGGTCTCA CCATRGCTGC TGGAATCACA OCTGAGAACT 1080 

GAGATATACC AATATTTAAC AriCTTACAA AGAAGAAAAG ATACAGATTT GGTGAATTTG 1140 

TTACTGTGAG GTACAGTCAG TACACAGCTG ACTTATGTAG ATTTAAGCTG CTAATATGCT 1200 

ACTTAACCAT CTATTAATGC ACCATTAAAG GCTTAGCATT TAAGTAGCAA CATTGCGGTT 1260 

55 TTCAGACACA TGGTGAGGTC CATGGCTCTT GTCATCAGGA TAAGCCTGCA CACCTAGAGT 1320 

GTOGGTGAGC TGACCTCACG ATGCTGTCCT CGTGCGATTG CCCTCTCCTG CTQCTGGACT 1380 

TCTGCCTrTG TTGGCCTGAT GTGCTGCTGT GATGCTGGTC CTTCATCTTA GGTGTTCATG 1440 

CAGTTCTAAC ACAGrTTQGGG TTGGGTCAAT AGTTTCCCAA TTTCAGGATA TTTCGATGTC 1500

AGAAATAACG CATCTTAGGA ATGACTAAAC AAGATAATGG CAGTTTAGGC TGCACAACTG 1560 

5 GTAAAATGAC TGTAGATAAA TGTTGTAATT AGTGTACACG TTTGTATTTT TGTTAATATA 1620 

GCCGCTGCCA TAGTTTTCTA ACTTGAACAG CCATGAATGT TrCATGTCTC CCTTTTTTTT 1680 

TICTCTATAG CTGTTACCTA TTTTAGTGGT TGAAATGAGA GCTAGTGATG ACAGAAGGAT 1740 

GTGGAATGTC TTCTTGACAT CATTGTGTAT TGCTGGTAAT CAAGTTGGTA ACGACTACTT 1800 

CTAGCAGCTC TTACCACTAT GACTTAAGTG GTCCTGGAAG GCAGTAAGTG GAGGTTTGCA 1860 

15 GCATTCCTGC CTTCATGAGG GCTTCTACCA CTGACCACTT TGCACGTACC TGGCTCCXAG 1920 

ATTTACTTAG GTACCCCACX; AGTCGTCCAC ATAAGCAGCT TCATCTTTAC CTTGCCAGAG 1980 

TTGACAATTA TGGGATACTC TAGTCTACTT ATACITOTGT TCCCATCTGT CTGCCATCCT 2040 

CTGAAGGCCA GGACCCAGTC ATACATCCTT AGAAACCAAA GTATGGTTTT TCTTTTCTCT 2100 

TGGAATGTCA GC5TCTTAAGG CATTTAATTG AGGGACAAAA AAAAAAAAAA GCCX5ATATAG 2160 

25 TAGCTAGCTA CTTAAGCATC CATGGGTATT GCTCCATATC AAAGCAGATT TGCAGGACAG 2220 

AAAGAGTAAA TTAGCCTTCA GTCTTGGTTT ACAGCTTCCA AGGAGAGCCT TGGSCACCTG 2280 

AAATGTTAAC TCGGTCCCTT CCTGTCTCTA GTTCATCAGC ACCTGCAGAT GCCTGACTCT 2340 

TGTTAGCCTT ACTATTCAAT ACAGTCCTTA GATTCACGGT ATCCCTCTTC CTATCCAGGC 2400 

ACCTATTCTG AATCACCATG TTGCTCTGCA GCTAGAGTTG ATAGGAGAAA ATCCATTTGG 2460 

35 GTAGATGGCC TATGAATTTG TAGTAGACTT TCAAAATGAG TGATTTGTrA GCTTGGTACT 2520 

TTTAAGTTTG TGGTACAGAT CCTCCAAACC CATACTCTGA GCAATTAACT GCCTTGAACA 2580 

TAGAGAAAAA TTAAGGCCTC ACAGGATGAG TCTCCATTCT CTGTAAATGC TTATTTTATC 2640 

ATAGTCTTTA GCCTCEAACT AITGAGTAAAA TGTTCTCTTC GGCCGGGTGT GGTGACTCAC 2700 

ACCTGTAACC TCAGCACTTT GGGAGGCAGA GGTGGGAGGA TCACTTAGGT CCAGGAGTTC 2760 

45 GAGACTAGCC TQGGCAACAT AGTGAGACAC OGGATCTACA AAAAAATAAA AAGCXAGACT 2820 

GGTQGTATGT ATCTGTGTCC CAGCTAATTG GGAGGGTGAG ATGGGAGGAT TGTTTGAGCC 2880 

TAGGAGAGGG AGGTTGCAGT GAGCCGTGAT CGCACCACTG CACTCCAGCC TGGGCAACAG 2940 

AGCAAGACCC TGTCTTGGAG AAACCAGAAT TTTGGAAGAG CAAAOXSGGGC TGAGTGCAGT 3000 

GGCTCATGCC TGTAATCC 3018 

(2) INFORMATION FOR SEQ ID NO: 221:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 968 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:
GGCACGAGGG CCGCGGGACA TCCAOGGGGC GCGAGTGACA CGCGGGAGGG AGAGCAGTGT 60 
10 TCTGCTGGAG CCGftTGCCAA AAACCATQCA TTTCTTATTC AGATTCATTG TriTCTTTTA 120 
TCTGTQGGGC CTTTTTACTG CTCAGAGACA AAAGAAAGAG GAGAGCACCG AAGAAGTGAA 180 
AATAGAAGTT TTC5CATCGTC CAGAAAACTG CTCTAAGACA AGCAAGAAOG GAGACCTACT 240 

NAAATGCCCA TTATGACGGC TACCTGGCTA AAGACGGCTC GAAATTCTAC TGCAGCCQGA 300 
CACAAAATGA AGGCCACCCC AAATGGTTTG TTCTTGGTGT TGGGCAAGTC ATAAAAGGCC 360 
20 TAGACATTGC TATGACAGAT ATGTGCCCTG GAGAAAAGCG AAAAGTAGTT ATACCCCCTT 420 
CATTTGCATA CGGAAAGGAA GQCTATGCAG AAGGCAAGAT TCCACCGGAT GCTACATTGA 480 
TTTTTGAGAT TGAACTTTAT GCTGTGACCA AAGGACCACG GAGCATTGAG ACATTTAAAC 540 

AAATAGACAT GGACAATGAC AGGCAGCTCT CTAAAGCCGA GATAAACCTC TACTTGCAAA 600 
GGGAATTTGA AAAAGATGAG AAGCCAOGTG ACAAGTCATA TCAGGATGCA GTTTTAGAAG 660 
30 ATATTTTTAA GAAGAATGAC CATGATGGTG ATGGCTTCAT TTCICCCAAG GAATACAATG 720 
TATACCAACA CGATGAACTA TAGCATATTT GTATTTCTAC TTTTTnTTT TAGCTATTTA 780 
CTGTACrrTA TGTATWAAAC AAAGTCMCTT TTCTCCMAGT TGKATTTGCT ATTTITCCCC 840 

TATGAGAAGA TATTTTGATC TCCCCAATAC ATTGATTTTG GTATAATAAA TGTGAGGCTG 900 
TTTTGCAAAC TTAAAAAAAA ATTTAAAAAA ACTGGAGGGG GQCCCGTACC CAAmXXXXX5 960 
40 NATATGAT 968 



(2) INFORMATION FOR SEQ ID NO: 222:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1404 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:

55 CGTTTTCCGG COGTGCGTTT GTCGCCCTrCC GGCCTCOCTG ACATGCAGCC CTCTGGACCC 60 

CGAGGTTGGA CCCTACTGrTG ACACACCTAC CATGCGGACA CTCTTCAACC TCCTCTGGCT 120 

TGCCCTGGCC TGCAGCCCTG TTCACACTAC CCTCTCAAAG TCAGATGCCA AAAAAGCCGC 180 






CTCAAAGACG CTTGCTGGAGA AGAGTCAGTT TTCAGATAAG CXX3GTGCAAG ACCGGGGTTT 240 

GGTGGTGACG GACCTCAAAG CTGAGAGTGT GGTTCTTGAG CATCGCAGCT ACTGCTCGQC 300 

AAAGGCCCGG GACAGACACT TTGCTGGGGA TGTACTGGGC TATCJTCACTC CATGGAACAG 360 

CCATC5GCTAC GATGTCflCCA AGGTCTTTGG GAGCAAGTrC ACACAGATCT CACCXX7PCTG 420 

GCTGCAGCTG AAGAGACGTG GCXX3TCAGAT GTTTGAGGTC ACGGGCCTCC ACGACGTGGA - 480 

CCAAGGGTGG ATGCGAGCTG TCAGGMGCA TGCXAAGGGC CTGCACATAG TGCXTTCGGCT 540 

CCTGTTTGAG GACTGGACTT ACGATGATTT CCGGAACGTC TTAGACAGTG AGGATGAGAT 600 

AGAGGAGCTG AGCAAGACCG TGGTCCAQGT GGCAAAGAAC CAGCATTTCG ATGGCTTCGT 660 

GGTGGAGGTC TGGAACCAGC TGCTAAGCCA GAAGCGCGTG GGCCTCATCC ACATGCTCAC 720 

CCACTTGGCC GAGGCTCTGC ACCAGGCCCG GCTGCTQGCC CTCCTGGTCA TCCCGCCTGC 780 

CATCACCCCC GGGACCGACC AGCTGGGCAT GTTCACGCAC AAGGAGTrTG AGCAGCTGGC 840 

CCCCXTTGCTG GATQGTTTCA GCCTCATGAC CTACGACTAC TCTACAGCGC ATCAGCX7PGG 900 

CCXTTAATGCA CCCCTGTCCT GGGTTCGAGC CTQOGTCCAG GTCCTGGACC CGAAGTCCAA 960 

GTGGCGAAGC AAAATCCTCC TGGGGCTCAA CTTCTATGGT ATGGACTACG OGACCTCCAA 1020 

GGATCCCCGT GAGCCTGTTG TCGGGQCCAG GTACATCCAG ACACTGAAGG ACCACAGGCC 1080 

CCGGATGGTG TGGGACAGCC AGGYCTCAGA GCACTTCTTC GAGTACAAGA AGAGCOQCAG 1140 

TGGGAQGCAC GTCGTCTTCT ACCCAACCCT GAAGTCCCTG CAGGriGCXSGC TGGAGCTGGC 1200 

CCGQGAGCTG GGCGTTGGGG TCTCTATCTG GGAGCTGGCX: AGGGCCTGGA CTACTTCTAC 1260 

GACCTGCTCT AGGTQGGCAT TGCOGCCTCC GCGGTGGACG TGTTCmTC TAAGCXATGG 1320 

AGTGAGTGAG CAGGTGTGAA ATACAGGCCT NCACTCCGTT TGCTGTGAAA AAAAAAAAAA 1380 

AAAAAAAAAA AAAAAAAAAA AAAA 1404 






(2) INFORMATION FOR SEQ ID NO: 223:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 707 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223: 
NGCGCGCCTG CAGTCGACAC TAGTQGATCC AAAGAATTCG GCACGAGGGC AGGTCCAGGG 60
CTCAGAAATC AGCTCTATTG ACGAATTCTG CCGCAAGTTC CGCCTGGACT GCC0GCTC3GC 120
CATGGAGCGG ATCAAGGAGG ACCGGCCCAT CACCATCAAG GAOGACAAGG GCAACCTCAA 180
CXX3CTGCATC GCAGACCJrGG TCTCGCTCTT CATCACGGTC ArGGiC.---iX: TSCC-CCTGGA
GATCCGCX3CC ATGGATGAGA. TCG^SCCCG;^. CCTGCGAGAG CTGA.TGG.^ CCL^ZGCACCG 
CftTGflGCCAC CTCCCACCCG ACTTTG^dGGG CC3CC^GACG GICAGCC^^ GQCTGCAGAC 
CCTGAGCGGC ATGTCGGCGT CAG^JTG^JGCT GG.^JCG;CTC^^ CAGGTGCtrrC i^-JrC-CTGTT 
CGACCTGGAG TCAGCXTTACA ACGCCTTCAA CCGCTTCCT3 C-~GCCr:L=;G CCCGGGGCAC 
TAGCCCTTGC ACAGAAGGGC AGAjSTC^GAG GCGATGGCTC CTGGTCCCCT GCCCGCCACA 
CAGGCXGTGG TCATCCACAC AACrCACTGT CTGCAGCTGC CTGTCTG^^G TCI^CTTTG 
GrrCTCAGAAC TTTTGGGCCG GGCXX:CTCCC a=CAAT.WJG ATC-CTCICCX3 ACCrTCAAAA 
AAAAAAAAAA AAAAACTCRG GGGGGGCCXX; GTCCCAATCC CCCCMEC; 

(2) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1334 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:
GGGGAACTGC AGTCACAGCA GGAGrA;^^;^^ rSGGAGQCAG GkZl^X^jyrXi Ga-.airAGGT 
ATGGAGAGGG QGTTCAGCGA GCCTAG^ySAG GGCAGACTAC CAGGGTGCrG GCGGTGAGAA 
TCCAGGGAGA GGAGCGGAAA CAGSJ^JGGG GCAGAA3ACC GGGGCACTTG TGG:7rTGCAG 
AGCCCCTCAG CCATGTTGGG AGCCAAGCC-. CACTGGCTAC C^^GGTCCCCT AXrACPCTCCC 
GQOCTGCCCT TGGTTCTGGT GCrTCTGGCC CI^SQGGaCCG GGTGGGCCCA GGAGQGGTCA 
GAGCCCGTCC TGCTGGAGGG GG^CTGCCTG GTGGTCrGTG AGCCTGGCCG jyGCTGCTGCA 
GGGGGGCCCG GGGGAGCAGC CCTGGGAGAG GCACCCCCTG GGCGftGTGGC A^^GCTGOG 
GTCCGAAGCC AMCACCATGA GCGGCAGGG SU^ACCa^C^ ASGGCiCCAK TGGGGCCATC 
TACTTCGACC AGGTCCTGGT GAACGAGGGC GCSnGGCTTTG ACCGGGCTIC TGGCrCCTTC 
GTAGCCCCTG TCCGGGGrTGT CTACAGCTTC CGGTTCCAXG TGCTG^-^GGT Gr.-JCAACCGC 
CAAACTGTCC AGGTGAGCCT GATSCTGAAC ACGTGGCCTG TCATCIOiSC CmOCCAAT 
GATCCTGACG TGACCCGGGA GGCAGCCACC AGCTCTGTGC TACTGCCCTT GGACCCTGGG 
GACCGAGTGT CTCTGCGCCT GCGTCGGGGG AATCTXTCG GIGGTTGGAA AITCTCAAGT 
TTCTCTGGCT TCCTCATCTT C C CTCTCTGA GGACCCAAGT YTTTCAAGCA CAPiGAATCCA
GCXXCTGACA ACTTTCTTCT GCCCTCTCTT GCCCCAGAAA CAGCAGAGGC AGGAGAGAGA
CTCCCTCTGG YTCCTATCCC ACYTCTTTGC ATGGGAMCCT GTGCCAAACA CX:CAAGTTTA 
AGARAARARY ARARCTGWGG CAGCTTATACA GAGCTGGAAG TGGACCATGG AAAACATSGA 
TAACCATGCA TCYTCTTGCT TQGCCACCTC CTCAAACTGT CCACCTTTGA AGTTPGAACT 
TTAGTCCCTC CAMACTCTGA CTGCTGCCTC CTTCCTCCCA GCTCTCTCAC TGAGTTATYT 
TCACTGTACC TGriTCCAGCA TATCCCCACT ATCTCTCTTT CTCCTGATCT GrTGCTGTCIT 
ATTCTCCTCC TTAGGCTTCC TATTACCTGG GATTCCATGA TTCATTCCTT CAGACCCTCT 
CCTGCCAGTA TGCTAAACCC TCCCTCTCTC TTTCTTATCC CX3CTGTCCCA TTGGCCCAGC 
CTGGATCAAT CTATCAATAA AACAACTAGA GAATGGTGGT CAAAAAAAAA AAAAAAAAAC 
TCGA 

(2) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 760 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 
GGGTCGACCC AOGCGTCCGC TGACCAGTCC GTTATAGATA 
GTTTAAACAG GTGCCACCAC AAGGGATGTC GTOCTTACTC 
CCTTTGTGGG AAARGTCTCT GGGCAAGCAC GTGGTATTTG 
TTTCCACCAG GGATGTTGTG ATCATAAGTC AAAACAACAG 
CTATTGTGGC CTGAGCACAA TTGAAATCTA GCAGAGTTTT 
ACTCTTCTGC TTCTCTGTCA CTTACAATTC AGGTTCTGCC 
GAAGAGTCCT CATGTGACGC TTAGTTCTAT TGCAGTCCTG 
GOGGCTGCTK CTCCCCANWT CCTCCCTAAC AATTCGTTGT 
GTTAGTGGCT TTTQCTTQGG ATCAGTQCTC TCTATTGATG 
CATTCCTGTT GCATTAAGAC TTGAAAGACT TGTAGATGTG 
CPGAAAGCTA TGTTACTATT CTTAGrTTTGT AAATTGTCCT 
TTCTTTTTGT AGGTATAAAT AAAAACACTG TTGACAATAA 
AAAAAAAAAA AAAAAAAAAA NAAAAAAAAA AAAAAAAAAA 



CTTCTTCCTA 
TCTGCGGGTC 
GTCTGCTGCT 
TATATTCCAA 
TCCTATCTTVG 
TTTGCCTAAG 
GGTGAAACTA 
CTTOGACTTCr 
TTCTTGCTGG 
TGATGTTCAG 
TTTGATACCA 
AAAAAAAAAA 



TACCAAAACT 
TTCAAGCATC 

ATCTCAAAAG 
CTTTAGAGTA 
AGCATGAGCA 
TTTAAGCWAT 
CATCTAAAAG 
TCTCCAGACA 
GCACAGQATG 

TCATCTTGrrr 

AAAAAAAAAA

(2) INFORMATION FOR SEQ ID NO: 226:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2057 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226: 
CCGAGCCGGC TGCGCCGGGG GAATCCGTGC GGGCGCCTTC CGTCCCRGTC CCATCCTCGC 
CGCGCTCCAG CACCTCTGAA GTTTTGCAGC GCCCAGAAAG GAGGCGAGGA AGGAGGGAGT 
GTGTGAGAGG AGGGAGCAAA AAGCTCACCC TAAAACATTT ATTTCAAGGA GAAAAGAAAA 
AGGGGGGGCG CAAAAATGGC TGGGGCAATT ATAGAAAACA TGAGCACCAA GAAGCTGTGC 
ATTGTTGGTG GGATTCTGCT CGTGTTCCAA ATCATCGCCT TTCTGGTGGG AGGCTTGATT 
GCTCCAGGGC CCACAAOGGC AGTCTCCTAC ATGTCGGTGA AATGTGTGGA TGCCCGTAAG 
AACCATCACA AGACAAAATG GTTCGTGCCT TQGGGACCCA ATCATTGTGA CAAGATOCGA 
GACATTGAAG AQGCAATTCC AAGGGAAATT GAAGCCAATG ACATCGTOTT TTCTGTTCAC 
ATTCOCCTCC CCCACATGGA GATGAGTCCT TGGTTCCAAT TCATG^r^GTT TATCCTGCAG 
CTQGACATTG CCTTCAAGCT AAACAACCAA ATCAGRGAAA ATGCAGAAGT CTCCATOGAC 
GTTTCCCTGG CTTACCGTGA TGACGCGTTT GCT6AGTGGA CT6AAATGGC CCATGAAAGA 
GTACCACQGA AACTCAAATG CACCTTCACA TCTCCCAAGA CTCCAGAGCA TGGAGGGCOG 
GTTACTATGA ATGTGATGTC CTTarTTTCA TGGAAATTGG GTCTGTGGCC CATGAAGTTT 
TACdTTTAA ACATCCGGCT GCCTGTGAAT GAGAAGAAGA AAATCAATGT GGGAATTGGG 
GAGATAAflGG ATATCCGGTT GGTGQQGATC CACCAAAATG GAGGCTTCAC CAAGGTGTGG 
TTTGCCATGA AGACCTTCCT TACGCCCAGC ATCTrCATCA TTATGGTC?rG GTATTGGAGG 
AGGATCACCA TGATGTCCCG ACCCCCAGTG CTTCTGGAAA AAGTCATCTT TGCCCTTGGG 
ATTTCCATGA CCTITATCAA TATCCCAGTG GAATGGTTTT CCATOGGGTT TGACTGGAOC 
TGGATGCTGC TGTTTQGTGA CATCCGACAG GCATCTTCTA TGCRATGCTT CTKTCCTTCT 
GGATCATCTT CTGTGGCGAG CACATGATGG ATCAGCACGA GCGGAACCAC ATCGCAGGGT 
ATTGGAAGCA AGTOGGACCC ATTGCCGTTG GTCCTTCTGC CTCTTCATAT TTGACATGTG 
TGAGftGAQGG GTACAACTCA CGAATCCCTT CTACAGTATC TGGACTACAG ACATTGQGAA 
CAGAGCTGGC CATGGCTTTC ATCATCGTQG CTGGAATCTG CCTCTGOCTC TAACTTCCTG 
TTTCTATGCT TCATGGTATT TCAGGTGTTT CGGAACATCA GTGGGAAGCA GTCCAGCCTG

CC-.3Cn.=^2^-. ^OJ-J-JTVCC^ SC3GCTAC-.C T.-.TGPiGSGGC T;^JVrTTTTAG GTTCAAGTTC 1500

CTCrtTGCTTA TC^rCTTC-GC rr3C3CTGrC ^-JTCL-JTI^^rCA TCTTCTTCAT CGTTAGTCAG 1560 

GTAAC3G?lAG GCCAr?3G3A AATGGGGC3G C7:C?CJ-J:rZC CCAAGTGAAC AGTGCXrrTTT 1620 

TC-ja-G3CL-.r C:^J?^3G--.rj TGI^J^JTCTG? ATGCCTTTGC TCTGATGTTC TTGTATGCAC 1680 
CAirCCCrtT.---. A.--AC7ATGG= a-i^JSACCAGT CC=u=JrSG?-iT GCAACTCCCA TGTAAATCGA . 1740 

GGaiPdSATTG TGCrmCTTT GTTrCGGA-r TTTATCAAGA A'TTGTTCAGC GCTTCGAAAT 1800 

ATTCCITCAr CA^OGACAAC G~-.GCr??CTG GTATTTSA:?? CAACAAGGCA ACACATGTTT 1860 

A'TCAGCinG CAI?TrGCA3T TTTCACAGrC ACArTG-Z^G TACTTC5TATA CGCACACAAA 1920 

TAOyrrCATT TA3CC?r?Ar CTCP-AAATG? TAAATATAAG GAAAAAAGCG TCAACAATAA 1980 

ATArrCTrrG A:=rATCGrCT TACTTCTCT? AAAAAAAAAA AAAAAAACTC GTGCCGAATT 2040 

CGGCACGAGC GGCAC3A 2057 



(2) INFORMATION FOR SEQ ID NO: 227:

(A) LENGTH: 2034 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

GGC^GA3GGC CATTTCCrGC AAAG?J3CCAA ACCZCCATTC CTCTGTGCCC CTCCTCTCCC 60 

ACCAA^TGCr rr^JTAAAAAT A^cfCTTCrT ACCGGAAATA ACTCTTCAIT TTTCACTCCT 120 

CCCTCCTAGG rCACACriTT G«AAAAAGA ATCTGCATCC TSGAAACCAG AAGAAAAATA 180 

TG;^GAC3GGG AAICArCGTG TG5.TOTGrGr SCTGCCTTCG GCTGAGTGTG TGGAGrrCCTG 240 

CTCAGGIGTT AGGrTAC^iGrG TGrTTGAXCG TGGrGGCITG AGGGGAACCG CTTGrTTCAGA 300 

GCIGZGiCTG CGGCTSCaCT GCaG?JGAA3C TGCCCTTSGC TGCTCGTAGC GCCGGGOCTT 360 

CTCrCCTCG? CATCJCCCAG A3CAGCCA3T GTCCGGG^JSG CaGAAGGTAC CGGGGCAGCT 420 

ACTCCSISGAC ZGTGCGGGCC TGCCTGGGCT GCCCCCTCCG CCGTQGGGCC CTGTTGCTGC 480 

TGrrCCATCTA TITCTACXAC TCCCTCCXAA ATGCGGTCGG CCCGCCCTTC ACTTGGATGC 540 

TTGCCCTCC7 GGGCCTTCTC SlAGGCftCTG AACATCCTCC TGGGCCtCAA GGGCCTQGCX 600 

CCAGCTGAGA G-GTGAAA^JV GGG^ATITCA ACGTGGCCCA TGGGCTQGCA 660 

TGGTCATA'rr ACATCGGATA rCTGCGGCTG ATCCTGCCAG AGCTCCAGGC CCGGATTCGA 720 

ACTTACAATC ASCA-TTACAA CPJVCCTGCTA CXXT-GCTGCAG TGAGCCAGCG GrTGTNATATT 780

CrcCTCCCAT TGGACTGTGG GGTGCCTGAT AACCTGAGTA TGGCTGACCC CAACATTCGC 840

TTCCTGGATA AACTGCCCCA GCAGACCGGT GACCGTGCTG GCATCAAGGA. TCGGGTTTAC 900 

AGCAACAGCA TCTAlXy^GCT TCTGGAGAAC QGGCAGOGGG CGGGCACCTG TGTCCTGGAG 960

TACGCXIACCC CCTTGCAGAC TTTGrTTTGCC ATGTCACAAT ACAGTCAAGC TGGCTTTAGC 1020

GGGGAGGATA QGCTTGAGCA GGCCAAACTC TTCTGCCGGA CACTTGAGGA CATCCTGGCA 1080

GATCCCCCTG AGTCTCAGAA CAACTGCCGC CTCATTGCCT ACCAGGAACC TGCAGATGAC 1140

AGCAGCTPCT CQCTGTCCCA GGAGGTTCTC OGGCACXnXSC QGCAGGAGGA AAAGGAAGAG 1200

GTrACTGTGG GCAGCTTGAA GACCTCAGCG GTOXCAGTA CCTCCACX3AT GTCCCAAGAG 1260

CCTGAGCTCC TCATCAGTGG AATGGAAAAG CCCCTCCCTC TCCGCACGGA TTTCTCTTGA 1320

GACCCAGGGT CACCAGGCXIA GAGCCTCCAG TGGTCTCCAA GCCTCTGGAC TGGGGGCTCT 1380

CrrCACTPGGC TGAATCTCCA GCAGAGCTAT TTCCTTCCAC AC3GGGGCCTT GCAQGGAAGG 1440

GTCCAGGACT TGACATCTTA AGATGCGTCT TGTCCCCTTG GGCCAGTCAT TTCCCCTCTC 1500

TGAGCXrrCGG TGrrCTTCAAC CTGTGAAATG QGATCATAAT CACTGCCTTA CCTCCCTCAC 1560

GGTTGfTTGTG AGGACTGAGT GTGTGGAAGT TTTTCATAAA CTTTGGATC5C TAGTGTACTT 1620

AGGGGGICTG CCAQGTGTCT TTCATGGGGC CTTCCAGACC CACTCOCTAC CCTTCTCCCC 1680

TTCCTTTGCC CGGQGACGCC GAACTCTCTC AATGGTATCA ACAGQCTCCT TCGCXCTCTG 1740

GCTCCTGGTC ATGTTCCATT ATTGGQGAGC CXX3«3CAGAA GAATGGAGAG GAGGAGGAGG 1800

CXGAGnTTGG GGTATTGAAT CCCCCGGCTC CCACCCTGCA GCATCAAGGT TGCTATGGAC 1860

TCTCCTGCXX; GGCAACTCTT GCGTAATCAT GACTATCTCT AGGATTCTGG CAOCACTTCC 1920

TTCCCTGGCC CXOTAAGCCT AGCTGrPGTAT CX3GCAOCCCC ACCCCACTAG AGTACTCXCT 1980

CTCACTTGCG GTTTCCTTAT ACTCCACXXX: TTTCTCAACG GTCTTTTTTT AAAGCACATC 2040 

TCAGA3rrAAA AAAAAAAAAA AAAAAAAAAA AGGGGGGGCN CCNT 2084 






(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2143 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:

TOGACCCACG CGTCCGGrTTG AATTCCTTGA CCTGCAAACA CATATTTATT AGCCTGACTC 60 

AAACAATGAA GCTATTAAAA CITCGGAGGA ACATTGTAAA ACTCTCTTTG TATCGGCATT 120
TCACCAACAC GCTTATTTTG GCAGTGGCAG CATCCATTGT GTTTATCATC TGGACAACCA 180
TGAAGTTCAG AATAGTGACA TGTCAGTCGG ACTGGCGGGA GCTGTGGGTA GACGATGCCA 240
TCTGGCGCTT GCTGnTCTCC ATGATCCTCT TTGTCATCAT GGTTCTCTQG CGACCATCTG 300
CAAACAACCA GAGGTTTGCC TTTTCACCAT TGTCTGAGGA AGAGGAGGAG GATGAACAAA 360
AGGAGCCTAT GCTGAAAGAA AGCTTTGAAG GAATGAAAAT GAGAAGTACC AAACAAGAAC 420
CCAATGGAAA TAGTAAAGTT AACAAAGCAC AGGAAGATGA TTTGAAGTGG GTAGAAGAGA 480
ATGTTCCTTC TTCTGrTGACA GATGTAGCAC TTCCAGCCCT TCTGGATTCA GATGAGGAAC 540
GAATGATCAC ACACTTTGAA AGGTCCAAAA TGGAGTAAGG AATGGGAAGA TTTGCAGTTA 600
AAGATGGCTA CXIATCAGGGA AGAGATCAGC ATCTGTGTCA GTCTTCTGTA CGGCTCCATG 660
GGATTAAAGG AAGCAATGAC ATCCTGATCT GTTCCTTGAT CTITGGGCAT TGGAGrTQGC 720
GAGAGGTGrrC AGAACAAAGA GAACATCTTA CTGAAAACAA GTTCATAAGA TGAGAAAAAT 780
CTACGAGCTT CTTATTTACA ACACTGCTGC CXXTTTTCCT CCCAGACTCT GACATQGATG 840
TTCATGCAAC TTAAGTGTGT TGrTTCCTGAA CXTTCTCTAA TGTTTCATTT TTTAAATCTG 900
ACAAACTAAA AAGTTTAACX; TCTTCTAAAA GATTGTCATC AACACCATAA TATGTAATCT 960
CCAGGAGCAA CTGCCTGTAA TTTTTATTT A TTTAGGGAGT TACATAGGTG ATOQGGGAAA 1020
TTGTTAACTA CCTTTCATTT TCCTGGGAAG TCAAQGrrAC ATCTTGCAGA GGTTGTTTTG 1080
AGAAAAAAGG GCCCTTCTGA GTTAAGGAGC CATAGTTCTA TCAATGATCA AAAGAAAAAA 1140
AAAAAAAAGA GAAACTGTTA CAGTATGATT CAGATCATTT AAAAAAGCAA AATCAAGTGC 1200
AATTTTGTTT ACAAATGGTG TATATTAAAG ATTTTTCTAT TTCAGATGTA CTTTAAftGAG 1260
AAATATTAGC TTAACTCTTT TGACATCTGC TATTGTGACA CATCCCATTG CTGGCAATGT 1320
GGTGCACACT COGAAACTTT TAACTACTGT TTTGTAAGCX: TCCAAGGGTG GCATTGCAGG 1380
GTCCTTfiGGC AATGTTTTGT TTGCCTTTAT GCAGAGAGGT GCTCCAAGTC CTGTGATTGA 1440
GCACCGTGCT AGAGGAACTG TAATQCTTCA GAAGTTGTAG CTTATACAAA GGAAACAGGT 1500
CCTCCTGGCT TAATTTAAAC AGTTATTQCA TGAAGTAGCG TOGAGGCXXTT GGACTGCTGC 1560
TCGTPCrrTA GGATQGACTG TTCTGOTATC TGGTATrGGT TTAGAGACTG TTAATAAGGG 1620
ACATCACAAG GTGATGGGAT TCATTTGAAG CACTCTATTr CTGrTTTTAAT GGTnTATCC 1680
AATTTTGCXTP TCCCAAGATT TTTGTTCTAC ATAAAAAGTT CATGCXIACTT TTTAATATAA 1740
AAAAATTTAA CAAAATTAAT GTATTTTTCT CATTTTTTTC AAACTTTTTC TAAAGACTCT 1800
TTCTGTCAAA CTCATGAAAA ATTTCTTTCT ATGGCTTTTA TTCTAGATTG TCTTATTTTC 1860

TGrrTAAAACC AATGACCACA TGACCACAAT CTTCACTAAC TCATACTGCA GTGAAAGTOT 1920 

TAACCCTTAG GTAGTTTCTC TACAACTCTT TGCTATGGTG ATTTTTAAAA AAGTrTCCTA 1980 

GGGAAGTATC TCTGAGQGAA CAGGCAATCT GAAGGAACTG flCTATATTCT CX:ATQGCTAA 2040 

GTCCATTAGG CCAAAAGNCT GGCTTGGGTAT TGGTTGTCAN GCTGTCTATT GGCATATTAA 2100 

AAACGTAGGC CGGANQGAAT AATTAGGTTG TNATGCXX3GC GGG 2143 



(2) INFORMATION FOR SEQ ID NO: 229:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1025 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229: 

CCTGGCCCAC ATTGCTTCAT TGGCCTC3GCC ATGCGCCTGT ACTATGGCAG CCGCTAGTCC 60 

CTGACAACTT CCACCCTGAT TCCGGACCCT GTAGATTGGG CGCCACCACC AGATCCCCCT 120 

CCCAGGCCTT CCTCCCTCTC CCATCAGCAG CCCTGTAACA AGTGCCTTGT GAGAAAAGCT 180 

GGAGAAGTGA GGGCAGCCAG GTTATTCTCT GGAQGTTQGT GGATGAAGGG GTACCCTAQG 240

AGATGTGAAG TGTGGGTTTG GTTAAGGAAA TGCTTACCAT CCCCCACCCC CAACCAAGTT 300

CrrcCAGACT AAAGAATTAA GGTAACATCA ATACCTAGGC CTGAGAAATA ACCCCATCCT 360

TGTTC3GQCAG CTCCCTGCTT TGTCCTGCAT GAACAGACTT GATGAAAGTG GQGTGTGGGC 420

AACAAGTQGC TTTCCTTGCC TACTTTAGTC ACCCAGCAGA GCCACTQGAG CTGGCTAGTC 480

CAGCCCflGCC ATGGTGCATG ACTCTTCCAT AAGGGATCCT CACCCTTCCA CTTTCATGCA 540

AGAAOGCCCA GTTGCCACAG ATTATACAAC CATTACCCAA ACCACTCTGA CAGTCTOCTC 600

CAGTTCCAGC AAIGCCTftGA GACATGCTCC CTGCCCTCTC CACAGTGCTG CTCCCCACAC 660

CTAGCCTTTG TTCTGGAAAC CCCAGAGAGG GCTQQGCTTG ACTCATCTCA GGGAATGTAG 720

CCCCrOGGCC CTQGCTTAAG COSACACTCC TGACCTCTCr GTTCACCCTG AGGGCTGTCT 780

TGAAGCCCGC TACCCACTCT GfiGGCTCCTA GGAGGTACCA TGCTTCCCAC TCTQGQGCCT 840

GCCCCTGCCT AGCAGTCTCC CAGCTCCCAA CAGCCTGGGG AAGCTCTC3CA CAGAGTGACC 900 

TGAGACCAGG TACAGGAAAC CTGTAGCTCA ATCAGTGTCT CTTTAACTGC ATAAGCAATA 960 

AGATCTTAAT AAAGTCTTCT AGGCTGTAGG GrrGGTTCCTA CAACCACAGC CAAAAAAAAA 1020 

AAAAA 1025 



(2) INFORMATION FOR SEQ ID NO: 230:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1250 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230: 
GCCCACGCGT CCGCCCACGC GTCOGGCGGT GCGGA(3TATG GGGCGCTGAT GGCCATQGAG 
GGCTACTOGC GCTTCCTGGC GCYGCTGGGG TCGGCACTGC TCGTXiXSGCTr CCTGTCGC3TG 
ATSTTCGCCC TOGTCTGGGT CCTCCACTAC CGAGAGGGGC TTGGCTGQGA TGGGAGCGCA 
CTAGAGTTTA ACTGGCACCC AGTGCTSATG GTCACCQGCT TOGTCTTCAT CCAGGGCATC 
GCATCATCGT CTACAGACTG CCGTGGACCT GGAAATGCAG CAAGCTCCTG ATGAAATCCA 
TCCATGCAGG GTTAAATGCA GTTGCTGCCA TTCTTCCAAT TATCTCTC5TG GTGGCCGTGT 
TTGAGAACCA CAATGTTAAC AATATAGCCA ATATCTACAG TCTGCACAGC TGGGTTGGAC 
TCATAGCTGT CATATGCTAT TTGTTACAGC TTCTTTCAGG TTTTTCAGTC TTTCTGCTTC 
CATGGGCTCC GCTTTCTCTC CGAGCATTTC TCATQCCCAT ACATGTTTAT TCTC5GAATTG 
TCATCTTTGG AACAGTGATT GCAACAOCAC TTATGGGATT GACAGAGAAA CTGATTTTTT 
CCCTGAGAGA TCCTGCATAC AGTACATTCC OGCCAGAAGG TGTTTTCGTA AATACGCTTG 
GCCTTCTGAT CCTGGTGrTTC GGGGCCCTCA TTTTTTGGAT AGTCACCAGA CCGCAATGGA 
AACGTCCTAA GGAGCCAAAT TCTACCATTC TTCATCCAAA TGGAGGCACT GAACAGQGAG 
CAAGAGGTTC CATQCCAGCC TACTCTGGCA ACAACATGGA CAAATCAGAT TCAGAGTTAA 
ACARTCAAGT AGCAGCAAGG AAAAGAAACT TAGCTCTOGA TGAQGCTGGG CAGAGATCTA 
CCATGTAAAA TGTTGTAGAG ATAGAGCCAT ATAACGTCAC GTTTCAAAAC TAGCTCTACA 
GmTQCTTC TCCTATTAGC CATATGATAA TTQGGCTATG TAGTATCAAT ATTTACTTTA 
ATCACAAAGG ATGGmTCTT GAAATAATTT GTATTGATTG AGGCCTATGA ACTGACCTGA 
AITQGAAAGG ATGTGATTAA TATAAATAAT AGCAGATATA AATTGTGGTT ATGTTACCTT 
TATCTTGTTG AGGACCACAA CATTAGCACG GTGCCTTGTG CAKAATAGAT ACTCAATATG 
TGAATATGTG TCTACTAGTA GTTAATTQGA TAAACTGGCA GCATCCCTGA 

(2) INFORMATION FOR SEQ ID NO: 231: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1811 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:
OOICAOTAC CGGTCNGATT CCCGGGTCGA CCCACGCGTC CGCTGCATTC CAGGGCdTT 60
CACTTCGCnT CATTCTGAAG TTCCTGGATA ACATCrPCCA TGTCTTGATG GCCCAGGTTA 120
CCASTCTCAT TATCACAACA GTGTCTCJrCC TGGTCTTTGA CTTCAGGCCC TCCCTGGAAT 180
TTTTCTTOGA AGCCSCATCA GTCSTYCTCT CTATATTTAT TTATAATGCC AGCAAGCCTC 240
AAGTTCCGGA ATAOGCACCT AGGCAAGAAA QGATCCGAGA TCTAAGTGGC AATCTTTGGG 300
ACSCGTTCCAG TGGGGATGGA GAAGAACTAG AAAGftCTTAC CAAACCCAAG AGTGATGAGT 360
CAGA1X3AAGA TACTTTCTAA CTGGTACCCA CATAGTrTGC AGCTCTCTTG AACCTTATTT 420
TCACATTTTC AGTGTTTGTA ATATTTATCT TTTCACTTTG ATAAACCAGA AATGTTTCTA 480
AATCCTAATA TTCTTTGCAT ATATCTAGCT ACTCCCTAAA TGGTICCATC CAAGGCTTAG 540
AGTACCCAAA GGCTAAGAAA TTCTAAAGAA CTGATACAGG AGTAACAATA TGAAGAATTC 600
ATTAATATCr CAGTACTTGA TAAATCAGAA AGTTATATGT GCAGATTATT TrCCTTGGCC 660
TTCAAQCTTC CAAAAAACTT C3TAATAATCA TGTTAGCTAT AGCTTGTATA TACACATAGA 720
GATCAATTTG CCAAATATTC ACAATCATGT AGTTCTAGrTT TACATGCCAA AGTCTTCCCT 780
TTTTAACATT ATAAAAGCTA GGrTGTCTCT TGAATTTTGA QGCCCTAGAG ATAGTCATTT 840
TGCAflGTAAA GAGCAACGGG ACCCTTTCTA AAAACGTTGG TTGAAQGACC TAAATACCTG 900
GCCATACCAT AGATTTGGGA TGATGTAGTC TGTQCTAAAT ATTTTGCrGA AGAAGCflGTT 960
TCTCAGACAC AACATCTCAG AATTTTAATT TTTAGAAATT CATGGGAAAT TGGATTrTTG 1020
TAATAATCTT TTGATGTTTT AAACATTGGT TCCCTAGTCA CCATAGTTAC CACTTGTATr 1080
TTAAGTCATT TAAACAAGCC ACGGTGGGGC TTTTTTCTCC TCACTTTTGAG GAGAAAAATC 1140
TTCATOTCAT TACTCCTGAA TTATTACATT TTGGAGAATA AGAGGGCATT TTATTTTATT 1200
AGTTACTAAT TCAAGCTGTG ACTATTGTAT ATCTTTCCAA GAGTTGAAAT GCTGGCTTCA 1260
GAATCATACC AGATTGTCAG TGAAGCTGAT GCCTAGGAAC TTTTAAAGGG ATCCTTTCAA 1320
AAGGATCACT TAGCAAACAC ATGTTGACTT TTAACTGATG TATGAATATT AATACTCTAA 1380
AAATAGAAAG ACCAGTAATA TATAAGTCAC ITTACAGrTGC TACTTCACAC TTAAAAGTQC 1440
ATCGTATTTT TCATGGTATT TTGCATGCAG CCAGTTAACT CTCGTAGATA GAGAAGTCAG 1500
GTCATAGATG ATATTAAAAA TTAGCAAACA AAAGTGACTT GCTCAGGGTC ATGCAGCTGG 1560
GTCATCATAG AAGAGTGQGC TTTAACTGQC AGGCCTGTAT GTTTACAGAC TACCATACTG 1620
TAAATATCAG CTTTATGGTG TCATTCTCAG AAACTTATAC ATTTCTGCTC TCCTTTCTCC 1680
TAAGTTTCAT GCAGATGAAT ATAAGGTAAT ATACTATTAT ATAATTCATT TGTGATATCC 1740
ACAATAATAT GACTGGCAAG AATTGGTGGA AATTTGTAAT TAAAATAATT ATTAAACCTA 1800
AAAAAAAAAM N 1811



(2) INFORMATION FOR SEQ ID NO: 232: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2271 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:
CTCACCTCAT GGCGTAGAGC CTAGCAACAG GGCAGGCTCC CAGCCGAGTC CGTTATGQCC 60
GCTGCCGTCC CGAAGAGGAT GAGGC3QGCCA GCACAAGCGA AACTGCTGCC CGGGTCQGCC 120
ATCCAAGCCC TTGTGGGGTT GGCGCQGCXX; CTGGTCTTGG CX5CTCCTGCT TGTGTCCGCC 180
GCTCTA'TCCA GTGTTGTATC AOQGACTGAT TCACCX5AGCC CAACCGTACT CAACTCACAT 240
ATTTCTACCX: CAAATGTGAA TGCTTTAACA CATGAAAACC AAACCAAACC TTCTATTTCC 300
CAAATCAGCA CCACCCTCCC TCCCACGACG AGTACCAAGA AAACTGGAGG AGCATCTGTG 360
GTCCCTCATC CCTCX3CCTAC TCCTCTGTCT CAAGAGGAAG CTGATAACAA TGAAGATCCT 420
AGTATAGAGG AGGAGGATCT TCTeATQCTG AACAGTTCTC CATCCACAGC CAAAGACACT 480
CTAGACAATG GCGATTATGG AGAACCAGAC TATGACTGGA CCACGGGCCC CAGGGACGAC 540
GACGAGTCTC ATNGACACCT TGGAAGAAAA CaGGQGTTAC ATGGAAATTG AACAGTCAGT 600
GAAATCTTTT AAGATGCCAT OCTCAAATAT AGAAGAQGAA GACAGCCAIT TCTTTTTTCA 660
TCETATTATT ri ' itJL ' mTr GCATPGCTOr TGrTTTACATT ACATATCACA ACAAAAGGAA 720
GATTTTTCTT CTQGTTCAAA GCAGGAAATG GOGTGATGGC CTTTGTTCCA AAACAGrTGGA 780
ATACCATCGC CTAGATCAGA ATCTITAATGA GGCAATGCCT TCTTTGAAGA TTAOCAATGA 840
TTATATTTTT TAAAGCACTG TGATTTGAAT TTQCTTATGT AATTTTATTT GCTTGACTTT 900
TTATATGATA TICTGCAAAT GrTTTGCCATA GGCAATTGGT ACTTAAATGA GAGCTGAGTC 960
TCTCTTTTGC CTTGGTGCTT TQGAAATTAA ATGTCACAAA CGAGTATATA ATTTTTTATC 1020
TGTACTTTTA GAGCTGAGTT TAATCAGGTG TCCAAAATGT GACTTTAAACA TTACCTTATA 1080
TTTACACTGT TAGTmTAT TGTTTTAGAT TTATTATGCT TCTTCTGGAA GTATTACTGA 1140
TGCTACTTTT AAAAGATCCC AAACTTGTAA CTAAATTCTG ACATATCTGT TACTGCTGAC 1200
TCACATTCAT TCTCTGCCAT TCAAATACTA TTTTTTATCC ACATTTTTTT TTCTTCCCAA 1260
ACTOTAATCT ACAAGGATAT GTGTCATAAT GCTTTGGATT TGAGTAATAT TTTTTTTTCT 1320
TCCAAGAAAA CTGCnTGGA TATTrTTAGA TAATTTAAAC ATAATTTAGG ATAATCATAT 1380
TGCTCAATCT GACCACAATT TTAGGTAAAA CATTAAATGT GTCAAGAAAT CTTQGCAACA 1440
GftGACTCTGC AGCTTGCAGT GGACATAGAT AAAATGTTAC AGAGATACTA TTTTTTTGGT 1500
TGGAATTACT ATATTAAATT TAGAAGCAGA AACTGGTAAA ATGTTAAATA CATGTACAAT 1560
TGCTTTTACT TAGCAATTGA TTGTrAGCATG GGTTCCTCCA AGGTTTCAAG CAATGGGCAG 1620
ACTTTAAAAT TATATCAGAT TCGTrTACTT CGTTTATTAT TTTACAGTAA ATTTGAATAA 1680
ATCTTAGGGG TCATTATCAC TTAAATAATA CTGTACCTAG GTCTTTCAAA TTAAAATTAT 1740
ACCTGAATGA AGTTGTTTCT ATACATAAAG GATAmCTG TACAATTACC TTTTTTCCCC 1800
CACACTTOTT ri l.T ri V iTr TTGTrTTTTA TGGCAACTGG AAAGTATTTA CTATGGGATT 1860
CATTTATCTC TGrrCTTTCTA TCATAAAGAA TTGATCAATA TGTAAATATG TGATTTGAAC 1920
CAIGCTTGAC TTACAAGTGT CACTACAGCT TTTTAGAAAA CATAGCCCTA ATATATGTTA 1980
AGCAGGACCX: GGCrTGAGCCA GTGGGCTTGC GCTTTATGTA GAGCTGGAAG AAGGCCGTCC 2040
ATCCTCTCTC TTGGGCXX3AC AGTGrrACTTT CCTAATAGGG AAGGGAAGCA CAATGQAAAT 2100
ACCCCTGAAC CGmTATTG CAGTAATTTT TTTCATATCT GAAACTATTA TTTAATATTT 2160
TGAATAAGAT TTTAAAAAAT AAATGGCAAA GATATAAATC TAAAAAAAAA AAAAAAAAAA 2220
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAANANA N 2271

(2) INFORMATION FOR SEQ ID NO: 233:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1338 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:

cttccggttc tccggqcagc tgccactgct gtagcttctg ccacctgcca cgaccgggcc 60
TCTCCCTCGC gtttggtcac ctctgcttca ttctccaccg cgcctatggt ccctcttgga 120
gccagcgtgg cgngcciggc gqctccoggg tqgtgagaga gcggtccggg aacgatgaag 180
gcctcgcagt gctgctgctg tctcagccac cpcttggctt ccgtcctcct cctgctgttg 240
ctgcctcaac taagcgggyc cctggmagtc ctgctgcagg cagccgaggc cgcgccaggt 300
YTTCGGCCTC CTCACCCTAG ACCAGGACAT TACCGCCXXTT GCCACCGGGC CXTTWACCCCT 360
GCCCAGCAGC CGGGCCGTGG TCTGGCTGAA GCTGCGGGGG CCGCGGGGCT CCGAGGGAGG 420
CAATCGCAGC AACCCTCTGG CCGGGCTTGA GACGGACGAT CACGGAGGGA AGGCCGGGGA 480
ARGCTCGGTG GGTQGCGGCC TTGCTGTGAG CCCCAACCCT GGCGACAAGC CTATGACCCA 540
GCGGGCCCTG ACCGTGTTGA TGGTGGTGAG CGGCX^CGGTG CTGGTGrTACT TCGTGGTCAG 600
GACGGTCAGG ATGAGAAGAA GAAACCGAAA GACTAGGAGA TATGGAGTTT TC3GACACTAA 660
CATAGAAAAT ATCGAATTGA CACCTTTAGA ACAGGATGAT GAGGATGATG ACAACACGTT 720
GTTTCATCCC AATCATCCTC GAAGATAAGA ATGTGCCTTT TGATGAAAGA ACriTATCTT 780
TCTACAATGA AGAGTGGAAT TTCTATGrTTT AAGGAATAAG AAGCX:ACTAT ATCAATGITG 840
GGGGGGTATT TAAGTTACAT ATATTTNAAC AACCTTTAAT TTGCTGTTGC AATAAATACC 900
GTATCCTTTT ATTATATCrT TATATGTATA GAAGrTACTCT GTTAATQGGC ICAGAGATGT 960
TGGGGATAAA GTATACrGTA ATAATTTATC TGTTTGAAAA TTACTATAAA ACGGrrGTTTT 1020
CTCKICGGrrr TTTGTTTCCT GCTTACCATA TCATTGTAAA TrGTTTTATG TATTAATCAG 1080
TTAATGCTAA TTATTTTTGC TGATGTCATA TGTTAAAGAG CTATAAATTC CAACAACCAA 1140
CTGGTGTGTA AAAATAATTT AAAATYTCCT TTACTGAAAG GTATTTXXCA TTTTTGTQGG 1200
GAAAA6AAGC CAAATTTATT ACTTTGrTGTT GGGGTTTTTA AAATATTAAG AAATGTCTAA 1260
GTTATTGTTT GCAAAACAAT AAATATGATT TTAAATTCTC TTAAAAAAAA AAAAAAAAAC 1320
CCCGGGGGGG GGCCCGC3N 1338

(2) INFORMATION FOR SEQ ID NO: 234:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:

Met Leu Ser Thr Gly Ile Glu Val Ala Arg Pro Pro Ala Thr Leu Leu
1 5 10 15

Gly Leu Met Phe Val Leu Thr Gly Met Pro Arg Gly Leu Arg Xaa
20 25 30

(2) INFORMATION FOR SEQ ID NO: 235: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235:

Met Asn Val Val Ile Val Ile Ile Leu Phe Ser Phe Asp Ser Val Gly
1 5 10 15

Thr Met Phe Ser Cys Asn Arg Ile Pro Lys Ile Thr Val Leu Asn Lys
20 25 30

Leu Lys Phe Xaa Cys Glu Val Leu Leu Arg Ile Gln Thr Ile Gln Gly
35 40 45

Phe Tyr Arg Cys Thr Arg Ile Ser Arg Tyr Lys Gly Ile Phe Pro Asp
50 55 60

Phe Cys Gln Ser Gln Cys Met Gly Cys Asn Pro Glu Ser Xaa Met Ala
65 70 75 80

Val Pro Ala Leu Val Thr Pro Ile Leu Ala His Arg Lys Lys Glu Lys
85 90 95

Gly Met Cys Leu Phe Thr Leu Ile Ile Ala Pro Thr Arg Cys Thr His
100 105 110

Tyr Phe Cys Xaa
115


(2) INFORMATION FOR SEQ ID NO: 236: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236:

Met Ser Ser Ala Lys Ile Val Arg Gln Arg Gly Ala Val Pro Thr Tyr
1 5 10 15

Tyr Thr Thr Glu Ala Gly Glu Ile Ile Phe Leu Val Leu Asn Trp Ser
20 25 30

Leu Ser Ile Leu His Ile Val Asp Val Leu Cys Ser Lys Pro Glu Lys
35 40 45

Ser Val Thr Glu Asp Ala Ala Ser Gly Leu Ser Gln Arg Met Thr Ala
50 55 60

Leu Val Trp Arg Lys Gly Pro Asp Gly Gly Ser Arg Lys Pro Ile Leu
65 70 75 80

Leu Leu Phe Phe Phe Leu Pro Leu Ile Leu Cys Phe His Ser Phe Ile
85 90 95

His Ser Ser Asn Ile Cys Xaa
100

(2) INFORMATION FOR SEQ ID NO: 237:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237:

Met Ile Leu Phe Pro Gln Xaa Ala Leu Arg Leu Gly Xaa Trp Pro Arg
1 5 10 15

Thr Trp Ser Ile Leu Xaa Lys Tyr Ser Val Asn Phe Phe Ser Ala Tyr
20 25 30

Ser Pro Met Gly Ala Val Gly Thr Glu Phe
35 40



(2) INFORMATION FOR SEQ ID NO: 238: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238: 

Met Ile Ile Leu Leu Leu Phe Met Leu Leu Asn Asn Val Val Leu Val
1 5 10 15

Gln Glu Asp Asn Cys Gln Arg Lys Asn Thr Val Gln Glu Arg Ai^ Xaa
20 25 30

Trp Ser Gln Trp Xaa
35 



(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:

Met Ala Ala Xaa Pro Pro Gly Cys Thr Pro Pro Xaa Leu Leu Asp Ile
1 5 10 15

Ser Trp Leu Thr Glu Ser Leu Gly Ala Gly Gln Pro Val Pro Val Glu
20 25 30

Cys Arg His Arg Leu Glu Val Ala Gly Pro Arg Lys Gly Pro Leu Ser
35 40 45

Pro Ala Trp Met Pro Ala Tyr Ala Cys Gln Arg Pro Thr Pro Leu Thr
50 55 60

His His Asn Thr Gly Leu Ser Glu Leu Leu Glu His Gly Val Cys Glu
65 70 75 80

Glu Val Glu Arg Val Arg Arg Ser Glu Arg Tyr Gln Thr Met Lys Val
85 90 95

Arg Arg Ala Gly Leu Gly Pro Thr Pro Gly Met Ser Cys Pro Gly Asn
100 105 110

Asp Asn Thr Val His Thr Met His Gly Glu Ala Asn Arg Gly Ser Xaa
115 120 125



(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 67 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:

Met Ser Ile Leu Cys Cys Pro Xaa Leu Cys Leu Phe Phe Ser Phe Cys
1 5 10 15

Ile Ser Ser Gly Ser Cys Pro Phe Ser His Val Ser Gln Leu Ser Phe
20 25 30

Ile Ala Thr Phe Ser Gln Ser Ser Pro Val Leu Leu Val Pro Ala Tyr
35 40 45

Asn Thr Tyr Leu Ser Phe Leu Ala Phe Leu Asp Cys Ala Ser Leu Thr
50 55 60

Ser Thr Xaa
65



(2) INFORMATION FOR SEQ ID NO: 241: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 69 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:

Met Ser Thr Phe Gln Leu Leu Leu Leu Ile Leu Ala Gln Ser Thr Tyr
1 5 10 15

Lys Ile Lys Ser Lys Pro Leu His Met Thr Asn His Thr Leu Leu Asn
20 25 30

Ser Pro Gly Leu Asn Pro Ser Ser Pro Thr Leu Asn Phe Lys Thr Gln
35 40 45

Gln His Glu Ser Val Ser Tyr Ala Cys Cys His Met Arg Ser Leu His
50 55 60

His Ala Phe Ala Xaa
65



(2) INFORMATION FOR SEQ ID NO: 242: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242:

Met Val Ser Val Val Leu Ile Phe Ser Phe Leu Ser Leu Thr Ile Ser
1 5 10 15

Thr Thr Ala Ser Ala Tyr Asn Gly Asn Asp Thr Gln Gly Trp Asn Asp
20 25 30

Lys Phe His Xaa Xaa Ser Val Lys Thr Gln Thr Xaa
35 40



(2) INFORMATION FOR SEQ ID NO: 243: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:

Met Ile Ser Asp Ala Gly Ala Gly Phe Gly Val Phe Leu Leu Val Pro
1 5 10 15

Arg Ala Gly His Cys Trp Gly Ala Gly Lys Pro Leu Pro Ser Cys Pro
20 25 30

Ser Val Ala Ser Ile Pro Ser Trp Val Leu Pro Ser Phe Leu Glu Arg
35 40 45

Gly Arg Xaa
50



(2) INFORMATION FOR SEQ ID NO: 244:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:

Met Val Gln Thr Ile Gln Asp Phe Leu Ser Leu Phe Ser Thr Pro Ile
1 5 10 15

Phe Leu Leu Leu Leu Met Phe Glu Thr Leu Ser Leu Ala Pro Ala Trp
20 25 30

Leu Lys Pro Leu Arg "=l1 Thr Ser His Ser Xaa
35 40



(2) INFORMATION FOR SEQ ID NO: 245:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:

Met Ile Leu Met Pro Gl'*- Leu 1-1 v Thr Ser Arg Gln Arg Ser Val Pro
1 5 10 15

Phe Val Pro Thr Leu ?-st. Ala Ser Pro Gly Ala Met Thr Gly Pro
20 25 30

Thr Ala Thr Leu Thr Ser IVs 2—- 1*br Thr Ala Cys Arg Val Ser
35 40 45

Trp Ala Asn Gly Trp Thr Ser 1^- Arg Thr Phe Arg Xaa
50 55 60



(2) INFORMATION FOR SEQ ID NO: 246:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246:

Met Ser His His Ala Q'—, P3K> Arg Phe Leu Leu lie Tbr 1<bz Leu Leu 
15 10 15 

Gln Glu Ala Lys Pro Val Ser Asr. Ile Pro His Leu Leu Glu Ser Trp
20 25 30

Tyr Phe Gly Xaa 
35 



(2) INFORMATION FOR SEQ ID NO: 247:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247:

Met Asn Ser Leu Phe Trp Met Ile Leu Leu Pro Val Ser Gln Asp Gln
1 5 10 15

Val Val Glu Gly Leu Gln Gly T-ly Phe Ser Gln Ile His Met Arg Ile
20 25 30

Leu Arg Lys His Leu Xaa
35

(2) INFORMATION FOR SEQ ID NO: 248: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 211 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:

Met Ser Arg Ser Xaa Asp Val Thr Asn Thr Thr Phe Leu Leu Met Ala
1 5 10 15

Ala Ser Ile Tyr Leu His Asp Gln Asn Pro Asp Ala Ala Leu Arg Ala
20 25 30

Leu His Gln Gly Asp Ser Leu Glu Cys Thr Ala Met Thr Val Gln Ile
35 40 45

Leu Leu Lys Leu Asp Arg Leu Asp Leu Ala Arg Lys Glu Leu Lys Arg
50 55 60

Met Gln Asp Leu Asp Glu Asp Ala Thr Leu Thr Gln Leu Ala Thr Ala
65 70 75 80

Trp Val Ser Leu Ala Thr Gly Gly Glu Lys Leu Gln Asp Ala Tyr Tyr
85 90 95

Ile Phe Gln Glu Met Ala Asp Lys Cys Ser Pro Thr Leu Leu Leu Leu
100 105 110

Asn Gly Gln Ala Ala Cys His Met Ala Gln Gly Arg Trp Glu Ala Ala
115 120 125

Glu Gly Leu Leu Gln Glu Ala Leu Asp Lys Asp Ser Gly Tyr Pro Glu
130 135 140

Thr Leu Val Asn Leu Ile Val Leu Ser Gln His Leu Gly Lys Pro Pro
145 150 155 160

Glu Val Thr Asn Arg Tyr Leu Ser Gln Leu Lys Asp Ala His Arg Ser
165 170 175

His Pro Phe Ile Lys Glu Tyr Gln Ala Lys Glu Asn Asp Phe Asp Arg
180 185 190

Leu Val Leu Gln Tyr Ala Pro Ser Ala Glu Ala Gly Pro Glu Leu Ser
195 200 205

Gly Pro Xaa
210

(2) INFORMATION FOR SEQ ID NO: 249:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 548 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:

Met Glu Asp Ser Glu Ala Leu Gly Phe Glu His Met Gly Leu Asp Pro
1 5 10 15

Arg Leu Leu Gln Ala Val Thr Asp Leu Gly Trp Ser Arg Pro Thr Leu
20 25 30

Ile Gln Glu Lys Ala Ile Pro Leu Ala Leu Glu Gly Lys Asp Leu Leu
35 40 45

Ala Arg Ala Arg Thr Gly Ser Gly Lys Thr Ala Ala Tyr Ala Ile Pro
50 55 60

Met Leu Gln Leu Leu Leu His Arg Lys Ala Thr Gly Pro Val Val Glu
65 70 75 80

Gln Ala Val Arg Gly Leu Val Leu Val Pro Thr Lys Glu Leu Ala Arg
85 90 95

Gln Ala Gln Ser Met Ile Gln Gln Leu Ala Thr Tyr Cys Ala Arg Asp
100 105 110

Val Arg Val Ala Asn Val Ser Ala Ala Glu Asp Ser Val Ser Gln Arg
115 120 125

Ala Val Leu Met Glu Lys Pro Asp Val Val Val Gly Thr Pro Ser Arg
130 135 140

Ile Leu Ser His Leu Gln Gln Asp Ser Leu Lys Leu Arg Asp Ser Leu
145 150 155 160

Glu Leu Leu Val Val Asp Glu Ala Asp Leu Leu Phe Ser Phe Gly Phe
165 170 175

Glu Glu Glu Leu Lys Ser Leu Leu Cys His Leu Pro Arg Ile Tyr Gln
180 185 190

Ala Phe Leu Met Ser Ala Thr Phe Asn Glu Asp Val Gln Ala Leu Lys
195 200 205

Glu Leu Ile Leu His Asn Pro Val Thr Leu Lys Leu Gln Glu Ser Gln
210 215 220

Leu Pro Gly Pro Asp Gln Leu Gln Gln Phe Gln Val Val Cys Glu Thr
225 230 235 240

Glu Glu Asp Lys Phe Leu Leu Leu Tyr Ala Leu Leu Lys Leu Ser Leu
245 250 255

Ile Arg Gly Lys Ser Leu Leu Phe Val Asn Thr Leu Glu Arg Ser Tyr
260 265 270

Arg Leu Arg Leu Phe Leu Glu Gln Phe Ser Ile Pro Thr Cys Val Leu
275 280 285

Asn Gly Glu Leu Pro Leu Arg Ser Arg Cys His Ile Ile Ser Gln Phe
290 295 300

Asn Gln Gly Phe Tyr Asp Cys Val Ile Ala Thr Asp Ala Glu Val Leu
305 310 315 320

Gly Ala Pro Val Lys Gly Lys Arg Arg Gly Arg Gly Pro Lys Gly Asp
325 330 335

Lys Ala Ser Asp Pro Glu Ala Gly Val Ala Arg Gly Ile Asp Phe His
340 345 350

His Val Ser Ala Val Leu Asn Phe Asp Leu Pro Pro Thr Pro Glu Ala
355 360 365

Tyr Ile His Arg Ala Gly Arg Thr Ala Arg Ala Asn Asn Pro Gly Ile
370 375 380

Val Leu Thr Phe Val Leu Pro Thr Glu Gln Phe His Leu Gly Lys Ile
385 390 395 400

Glu Glu Leu Leu Ser Gly Glu Asn Arg Gly Pro Ile Leu Leu Pro Tyr
405 410 415

Gln Phe Arg Met Glu Glu Ile Glu Gly Phe Arg Tyr Arg Cys Arg Asp
420 425 430

Ala Met Arg Ser Val Thr Lys Gln Ala Ile Arg Glu Ala Arg Leu Lys
435 440 445

Glu Ile Lys Glu Glu Leu Leu His Ser Glu Lys Leu Lys Thr Tyr Phe
450 455 460

Glu Asp Asn Pro Arg Asp Leu Gln Leu Leu Arg His Asp Leu Pro Leu
465 470 475 480

His Pro Ala Val Val Lys Pi3) His Leu Gly His Val Pro Asp Tyr Leu
485 490 495

Val Pro Pro Ala Leu Arg Gly Leu Val Arg Pro His Lys Lys Arg Lys
500 505 510

Lys Leu Ser Ser Ser Cys Arg Lys Ala Lys Arg Ala Lys Ser Gln Asn
515 520 525

Pro Leu Arg Ser Phe Lys His Lys Gly Lys Lys Phe Arg Pro Thr Ala
530 535 540

Lys Pro Ser Xaa
545



(2) INFORMATION FOR SEQ ID NO: 250:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 299 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250:

Met Thr Thr Val Pro Pro Ser Pro Arg Pro Met Ser Arg Pro Ser Glu
1 5 10 15

Arg Asn Met Arg Arg Pro Arg Gly Pro Ser Pro Leu Pro Ala Ser Pro
20 25 30

Arg Asn Ser Thr Pro Asp Glu Pro Asp Val His Phe Ser Lys Lys Phe
35 40 45

Leu Asn Val Phe Met Ser Gly Arg Ser Arg Ser Ser Ser Ala Glu Ser
50 55 60

Phe Gly Leu Phe Ser Cys Ile Ile Asn Gly Glu Glu Gln Glu Gln Thr
65 70 75 80

His Arg Ala Ile Phe Arg Phe Val Pro Arg His Glu Asp Glu Leu Glu
85 90 95

Leu Glu Val Asp Asp Pro Leu Leu Val Glu Leu Gln Ala Glu Asp Tyr
100 105 110

Trp Tyr Glu Ala Tyr Asn Met Arg Thr Gly Ala Arg Gly Val Phe Pro
115 120 125

Ala Tyr Tyr Ala Ile Glu Val Thr Lys Glu Pro Glu His Met Ala Ala
130 135 140

Leu Ala Lys Asn Ser Asp Trp Val Asp Gln Phe Arg Val Lys Phe Leu
145 150 155 160

Gly Ser Val Gln Val Pro Tyr His Lys Gly Asn Asp Val Leu Cys Ala
165 170 175

Ala Met Gln Lys Ile Ala Thr Thr Arg Arg Leu Thr Val His Phe Asn
180 185 190

Pro Pro Ser Ser Cys Val Leu Glu Ile Ser Val Arg Gly Val Lys Ile
195 200 205

Gly Val Lys Ala Asp Asp Ser Gln Glu Ala Lys Gly Asn Lys Cys Ser
210 215 220

His Phe Phe Gln Leu Lys Asn Ile Ser Phe Cys Gly Tyr His Pro Lys
225 230 235 240

Asn Asn Lys Tyr Phe Gly Phe Ile Thr Lys His Pro Ala Asp His Arg
245 250 255

Phe Ala Cys His Val Phe Val Ser Glu Asp Ser Thr Lys Ala Leu Ala
260 265 270

Glu Ser Val Gly Arg Ala Phe Gln Gln Phe Tyr Lys Gln Phe Val Glu
275 280 285

Tyr Thr Cys Pro Thr Glu Asp Ile Tyr Leu Glu
290 295



(2) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:

Leu Leu Tyr Leu Leu Lys Val Xaa Val Ile Phe Val Phe Ser Ser Ser
1 5 10 15

Lys Gly Val Thr Leu Val Ser Met Asn Leu Thr Ser Phe Phe Val Ser
20 25 30

Ser Val Leu Ala Cys Phe Ser Xaa
35 40



(2) INFORMATION FOR SEQ ID NO: 252:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 594 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:

Met Pro Ala Ser Ser Leu Glu Ser Arg Ser Phe Leu Leu Ala Lys Lys
1 5 10 15

Ser Gly Glu Asn Val Ala Lys Phe Ile Ile Asn Ser Tyr Pro Lys Tyr
20 25 30

Phe Gln Lys Asp Ile Ala Glu Pro His Ile Pro Cys Leu Met Pro Glu
35 40 45

Tyr Phe Glu Pro Gln Ile Lys Asp Ile Ser Glu Ala Ala Leu Lys Glu
50 55 60

Arg Ile Glu Leu Arg Lys Val Lys Ala Ser Val Asp Met Phe Asp Gln
65 70 75 80

Leu Leu Gln Ala Gly Thr Thr Val Ser Leu Glu Thr Thr Asn Ser Leu
85 90 95

Leu Asp Xaa Leu Cys Tyr Tyr Gly Asp Gln Glu Pro Ser Thr Asp Tyr
100 105 110

His Phe Gln Gln Thr Gly Gln Ser Glu Ala Leu Glu Glu Glu Asn Asp
115 120 125

Glu Thr Ser Arg Arg Lys Ala Gly His Gln Phe Gly Val Thr Trp Arg
130 135 140

Ala Lys Asn Asn Ala Glu Arg Ile Phe Ser Leu Met Pro Glu Lys Asn
145 150 155 160

Glu His Ser Tyr Cys Thr Met Ile Arg Gly Met Val Lys His Arg Ala
165 170 175
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Tyr Glu Gin Ala Leu Asn Leu Tyr Thr Glu Leu Leu Asn Asn Arg Leu 
180 185 190 

His Ala Asp Val Tyr Thr Phe Asn Ala Leu He Glu Ala Thr Val Cys 
195 200 205 

Ala He Asn Glu Lys Phe Glu Glu Lys Trp Ser Lys He Leu Glu Leu 
210 215 220 

Leu Arg His Met Val Ala Gin Lys Val Lys Pro Asn Leu Gin Thr Phe 
225 230 235 240 

Asn Thr He Leu Lys Cys Leu Arg Arg Phe His Val Phe Ala Arg Ser 
245 250 255 

Pro Ala Leu Gin Val Leu Arg Glu Met Lys Ala He Gly He Glu Pro 
260 265 270 

Ser Leu Ala Thr Tyr His His Ile Ile Arg Leu Phe Asp Gln Pro Gly
275 280 285 

Asp Pro Leu Lys Arg Ser Ser Phe He He Tyr Asp He Met Asn Glu 
290 295 300 

Leu Met Gly Lys Arg Phe Ser Pro Lys Asp Pro Asp Asp Asp Lys Phe 
305 310 315 320 

Phe Gin Ser Ala Met Ser He Cys Ser Ser Leu Arg Asp Leu Glu Leu 
325 330 335 

Ala Tyr Gin Val His Gly Leu Leu Lys Thr Gly Asp Asn Trp Lys Phe 
340 345 350 

He Gly Pro Asp Gin His Arg Asn Phe Tyr Tyr Ser Lys Phe Phe Asp 
355 360 365 

Leu Ile Cys Leu Met Glu Gln Ile Asp Val Thr Leu Lys Trp Tyr Glu
370 375 380 

Asp Leu He Pro Ser Ala Tyr Phe Pro His Ser Gin Thr Met He His 
385 390 395 400 

Leu Leu Gin Ala Leu Asp Val Ala Asn Arg Leu Glu Val He Pro Lys 
405 410 415 

He Trp Lys Asp Ser Lys Glu Tyr Gly His Thr Phe Arg Ser Asp Leu 
420 425 430 

Arg Glu Glu He Leu Met Leu Met Ala Arg Asp Lys His Pro Pro Glu 
435 440 445

Leu Gin Val Ala Phe Ala Asp Cys Ala Ala Asp He Lys Ser Ala Tyr 
450 455 460 

Glu Ser Gin Pro He Arg Gin Thr Ala Gin Asp Trp Pro Ala Thr Ser 
465 470 475 480 



Leu Asn Cys Ile Ala Ile Leu Phe Leu Arg Ala Gly Arg Thr Gln Glu
485 490 495
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Ala Trp Lys Met Leu Gly Leu Phe Arg Lys His Asn Lys lie Pro Arg 
500 505 510 

Ser Glu Leu Leu Asn Glu Leu Met Asp Ser Ala Lys Val Ser Asn Ser 
515 520 525 

Pro Ser Gin Ala He Glu Val Val Glu Leu Ala Ser Ala Phe Ser Leu 
530 535 540 

Pro lie Cys Glu Gly Leu Thr Gin Arg Val Met Ser Asp Phe Ala He 
545 550 555 560 

Asn Gin Glu Gin Lys Glu Ala Leu Ser Asn Leu Thr Ala Leu Thr Ser 
565 570 575 

Asp Ser Asp Thr Asp Ser Ser Ser Asp Ser Asp Ser Asp Thr Ser Glu 
580 585 590 

Gly Lys 



(2) INFORMATION FOR SEQ ID NO: 253: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 131 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253: 

Met Lys Leu Asn Leu Cys He Pro Asn Trp Ala Arg Cys Pro Leu Leu 
1 5 10 15 

Leu Leu Phe Pro Gin Leu Leu Pro Phe Gin Gly Glu Asp Asp Asp Pro 
20 25 30 

Leu Lys Ala Lys Ala Ala Asn Leu Val Glu Ala Val Pro Trp Gly He 
35 40 45 

Lys Ala Pro Ser Phe Gin Val Thr Cys Leu Val Arg Val Gin Leu Gin 
50 55 60 

Ser Cys Thr Pro Ser Arg Pro Ser Thr Leu Leu Ala Thr Ser Gin Ser 
65 70 75 80 

Pro Gly Arg He Ser Cys Tyr Ser Pro Leu Ser His Leu Pro Pro Val 
85 90 95 

Thr Thr Ser He Gin Pro Ser Pro Val Met Val Pro Phe Gin Tyr Gin 
100 105 110 

Ala Phe Leu Leu Gin Val Lys Glu Pro Ala Ala Gin Thr Leu Leu Gly 
115 120 125 



Gln Gln Xaa
130



(2) INFORMATION FOR SEQ ID NO: 254:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:

Met Arg Tyr His Ala Gln Leu Ile Phe Cys Ile Phe Cys Xaa Phe Val
1 5 10 15

Phe Val Xaa Lys Xaa 
20 



(2) INFORMATION FOR SEQ ID NO: 255:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255: 

Met Asn Asp Asn Ser Pro Asn His Ser Ser Ser Tyr Leu Pro Leu Pro 
1 5 10 15

Leu Thr He Val He Leu Gin Thr Gly His Lys Gly Thr Leu Xaa 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 256: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256: 

Met His Phe Leu Phe Arg Phe Ile Val Phe Phe Tyr Leu Trp Gly Leu
1 5 10 15

Phe Thr Ala Gin Arg Gin Lys Lys Glu Glu Ser Thr Glu Glu Val Lys 
20 25 30 

He Glu Val Leu His Arg Pro Glu Asn Cys Ser Lys Thr Ser Lys Lys 
35 40 45 

Gly Asp Leu Leu Asn Ala His Tyr Asp Gly Tyr Leu Ala Lys Asp Gly 
50 55 60 

Ser Lys Phe Tyr Cys Ser Arg Thr Gln Asn Glu Gly His Pro Lys Trp
65 70 75 80 

Phe Val Leu Gly Val Gly Gin Val He Lys Gly Leu Asp He Ala Met 
85 90 95 



Thr Asp Met Cys Pro Gly Glu Lys Arg Lys Val Val He Pro Pro Ser 
100 105 110 
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Phe Ala Tyr Gly Lys Glu Gly Tyr Ala Glu Gly Lys lie Pro Pro Asp 
115 120 125 

Ala Thr Leu lie Phe Glu lie Glu Leu Tyr Ala Val Thr Lys Gly Pro 
130 135 140 

Arg Ser lie Glu Thr Phe Lys Gin lie Asp Met Asp Asn Asp Arg Gin 
145 150 155 160 

Leu Ser Lys Ala Glu lie Asn Leu Tyr Leu Gin Arg Glu Phe Glu Lys 
165 170 175 

Asp Glu Lys Pro Arg Asp Lys Ser Tyr Gin Asp Ala Val Leu Glu Asp 
180 185 190 

lie Phe Lys Lys Asn Asp His Asp Gly Asp Gly Phe lie Ser Pro Lys 
195 200 205 

Glu Tyr Asn Val Tyr Gin His Asp Glu Leu Xaa 
210 215 



(2) INFORMATION FOR SEQ ID NO: 257: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257: 

Met Trp Val Ile Arg Val Phe Gln Lys Thr Phe Leu Phe Phe Val Leu
1 5 10 15

Phe Trp Ser Val His Cys lie Ser Asp Lys Phe Gly Cys Leu Trp His 
20 25 30 

Val Cys Met Lys Arg Glu Gly Asp Xaa Asn Cys Leu Ser Phe Ser Xaa 
35 40 45 

Leu Xaa 
50 



(2) INFORMATION FOR SEQ ID NO: 258: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 122 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258:

Met Pro Ser Gln Thr Glu Xaa Phe Ala Ala Cys Gly Gly His Ser Leu
1 5 10 15

Leu Leu Val Xaa Leu Pro Leu Gly Leu Pro Phe Cys Pro Arg Ala Ala
20 25 30 
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Leu Cys Asp Leu Pro Phe Ser Leu Pro Ser Phe Pro Gly Gin Ala Arg 
35 40 45 

Arg Gly Gly Ala Glu Lys Gin Gly Ala Glu Gly Arg Gly Leu Gin Val 
50 55 60 

Lys Pro Arg Gly Gin Arg Thr Phe Gin Val Ser Arg Thr Ala Pro Ala 
65 70 75 80 

Ala Pro Arg Ser Arg Gin Pro Arg Pro Pro Ala Ala Leu Pro Ala Leu 
85 90 95 

Gly Phe Gly Gly Arg Gly Val Ala Lys Gly Arg Phe Leu Cys Phe Trp 
100 105 110 

Cys Leu Tyr Met Leu Arg lie Asp Gin Xaa 
115 120 



(2) INFORMATION FOR SEQ ID NO: 259:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 88 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259:

Met Thr Ala Phe Cys Ser Leu Leu Leu Gln Ala Gln Ser Leu Leu Pro
1 5 10 15

Arg Thr Met Ala Ala Pro Gln Asp Ser Leu Arg Pro Gly Glu Glu Asp
20 25 30

Glu Gly Met Gin Leu Leu Gin Thr Lys Asp Ser Met Ala Lys Gly Ala 
35 40 45 

Arg Pro Gly Ala Xaa Arg Gly Arg Ala Arg Trp Gly Leu Ala Tyr Thr 
50 55 60 

Leu Leu His Asn Pro Thr Leu Gin Val Phe Arg Lys Thr Ala Leu Leu 
65 70 75 80 

Gly Ala Asn Gly Ala Gin Pro Xaa 
85 



(2) INFORMATION FOR SEQ ID NO: 260: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:

Met Ile Gln Val Ser Val Pro Leu Leu Thr Ile Met Ile Phe Leu Leu
1 5 10 15

Tyr Leu Gln Ile Gly Pro Gly Lys Leu Xaa
20 25



(2) INFORMATION FOR SEQ ID NO: 261: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261: 

Met Leu Leu Asp Pro Phe Ile Leu Leu Phe Cys Leu Phe Ser Thr Ala
1 5 10 15

Ala Gin Ser Cys Leu Glu Phe He Tyr He Gin Phe Xaa 
20 25 



(2) INFORMATION FOR SEQ ID NO: 262: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262: 

Met Lys Phe Leu Ser Ile Leu Leu Asp Asp Asn Asn Phe Xaa Leu Met
1 5 10 15

Leu Met Leu Ala Pro Phe Gly Cys Leu Ala Phe Glu Arg Ser Met Lys 
20 25 30 

Met Arg Asn Gly Ala Leu Gly Leu Glu Glu Val Xaa 
35 40 



(2) INFORMATION FOR SEQ ID NO: 263:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 363 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263: 

Met Arg Thr Leu Phe Asn Leu Leu Trp Leu Ala Leu Ala Cys Ser Pro 
1 5 10 15

Val His Thr Thr Leu Ser Lys Ser Asp Ala Lys Lys Ala Ala Ser Lys 
20 25 30 

Thr Leu Leu Glu Lys Ser Gin Phe Ser Asp Lys Pro Val Gin Asp Arg 
35 40 45 

Gly Leu Val Val Thr Asp Leu Lys Ala Glu Ser Val Val Leu Glu His 
50 55 60 

Arg Ser Tyr Cys Ser Ala Lys Ala Arg Asp Arg His Phe Ala Gly Asp
65 70 75 80

Val Leu Gly Tyr Val Thr Pro Trp Asn Ser His Gly Tyr Asp Val Thr 
85 90 95 

Lys Val Phe Gly Ser Lys Phe Thr Gin lie Ser Pro Val Trp Leu Gin 
100 105 110 

Leu Lys Arg Arg Gly Arg Glu Met Phe Glu Val Thr Gly Leu His Asp 
115 120 125 

Val Asp Gin Gly Trp Met Arg Ala Val Arg Lys His Ala Lys Gly Leu 
130 135 140 

His He Val Pro Arg Leu Leu Phe Glu Asp Trp Thr Tyr Asp Asp Phe 
145 150 155 160 

Arg Asn Val Leu Asp Ser Glu Asp Glu Ile Glu Glu Leu Ser Lys Thr
165 170 175 

Val Val Gin Val Ala Lys Asn Gin His Phe Asp Gly Phe Val Val Glu 
180 185 190 

Val Trp Asn Gin Leu Leu Ser Gin Lys Arg Val Thr Asp Gin Leu Gly 
195 200 205 

Met Phe Thr His Lys Glu Phe Glu Gin Leu Ala Pro Val Leu Asp Gly 
210 215 220 

Phe Ser Leu Met Thr Tyr Asp Tyr Ser Thr Ala His Gin Pro Gly Pro 
225 230 235 240 

Asn Ala Pro Leu Ser Trp Val Arg Ala Cys Val Gin Val Leu Asp Pro 
245 250 255 

Lys Ser Lys Trp Arg Ser Lys He Leu Leu Gly Leu Asn Phe Tyr Gly 
260 265 270 

Met Asp Tyr Ala Thr Ser Lys Asp Ala Arg Glu Pro Val Val Gly Ala 
275 280 285 

Arg Tyr He Gin Thr Leu Lys Asp His Arg Pro Arg Met Val Trp Asp 
290 295 300 

Ser Gin Xaa Ser Glu His Phe Phe Glu Tyr Lys Lys Ser Arg Ser Gly 
305 310 315 320 

Arg His Val Val Phe Tyr Pro Thr Leu Lys Ser Leu Gin Val Arg Leu 
325 330 335 

Glu Leu Ala Arg Glu Leu Gly Val Gly Val Ser He Trp Glu Leu Gly 
340 345 350 



Gin Gly Leu Asp Tyr Phe Tyr Asp Leu Leu Xaa 
355 360 



(2) INFORMATION FOR SEQ ID NO: 264: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264: 

Leu Pro Thr Lys Ile Leu Val Lys Pro Asp Arg Thr Phe Glu Ile Lys
1 5 10 15

lie Gly Gin Pro Thr Val Ser Tyr Phe Leu Lys Ala Ala Ala Gly He 
20 25 30 

Glu Lys Gly Ala Arg Gin Thr Gly Lys Glu Val Ala Gly Leu Val Thr 
35 40 45 

Leu Lys His Val Tyr Glu He Ala Arg He Lys Ala Gin Asp Glu Ala 
50 55 60 

Phe Ala Leu Gln Asp Val Pro Leu Ser Ser Val Val Arg Ser Ile Ile
65 70 75 80 

Gly Ser Ala Arg Ser Leu Gly I lei Arg Val Val Lys Asp Leu Ser Ser 
85 90 95 

Glu Glu Leu Ala Ala Phe Gin Lys Glu Arg Ala He Phe Leu Ala Ala 
100 105 110 

Gin Lys Glu Ala Asp Leu Ala Ala Gin Glu Glu Ala Ala Lys Lys Xaa 
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 265: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:

Met Leu Leu Gln Ile His Pro Leu Leu Pro Ser Pro Thr Ile Pro His
1 5 10 15

He Leu Leu Leu Phe Leu Tyr Pro Thr Phe Ser He Leu Glu His Ser 
20 25 30 

Cys Ser Tyr Cys He Glu Tyr Leu Trp Val Cys Leu Leu Phe Cys Leu 
35 40 45 

Ser Leu Trp Phe Leu Xaa 
50 



(2) INFORMATION FOR SEQ ID NO: 266: 
(i) SEQUENCE CHARACTERISTICS: 
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(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266: 

Met Cys Leu Trp Cys Cys Gly Asp Val Cys Ser Gly Leu Ser Ser Leu 
1 5 10 15

Leu Ser Leu Cys Val Cys Cys Val Val Leu Ala Val Cys 
20 25 



(2) INFORMATION FOR SEQ ID NO: 267:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267: 

Glu Gly Leu Arg Leu Leu Leu Ser Leu Pro Ala Ala Leu Pro Arg Ser 
1 5 10 15

Cys Cys His Pro Arg Trp Leu Pro Val Xaa 
20 25 



(2) INFORMATION FOR SEQ ID NO: 268: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 221 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268: 

Met Phe His Gly Ile Pro Ala Thr Pro Gly Ile Gly Ala Pro Gly Asn
1 5 10 15

Lys Pro Glu Leu Tyr Glu Glu Val Lys Leu Tyr Lys Asn Ala Arg Glu
20 25 30

Arg Glu Lys Tyr Asp Asn Met Ala Glu Leu Phe Ala Val Val Lys Thr
35 40 45

Met Gln Ala Leu Glu Lys Ala Tyr Ile Lys Asp Cys Val Ser Pro Ser
50 55 60 

Glu Tyr Thr Ala Ala Cys Ser Arg Leu Leu Val Gin Tyr Lys Ala Ala 
65 70 75 80 

Phe Arg Gin Val Gin Gly Ser Glu He Ser Ser He Asp Glu Phe Cys 
85 90 95 

Arg Lys Phe Arg Leu Asp Cys Pro Leu Ala Met Glu Arg He Lys Glu 
100 105 110 



Asp Arg Pro Ile Thr Ile Lys Asp Asp Lys Gly Asn Leu Asn Arg Cys
115 120 125

Ile Ala Asp Val Val Ser Leu Phe Ile Thr Val Met Asp Lys Leu Arg
130 135 140

Leu Glu Ile Arg Ala Met Asp Glu Ile Gln Pro Asp Leu Arg Glu Leu
145 150 155 160

Met Glu Thr Met His Arg Met Ser His Leu Pro Pro Asp Phe Glu Gly
165 170 175

Arg Gln Thr Val Ser Gln Trp Leu Gln Thr Leu Ser Gly Met Ser Ala
180 185 190



Ser Asp Glu Leu Asp Asp Ser Gin Val Arg Gin Met Leu Phe Asp Leu 
195 200 205 

Glu Ser Ala Tyr Asn Ala Phe Asn Arg Phe Leu His Ala 
210 215 220 



(2) INFORMATION FOR SEQ ID NO: 269:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269:

Met Lys Xaa 



(2) INFORMATION FOR SEQ ID NO: 270: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:

Met Gln Ala Pro Phe Xaa His Phe Ser Phe Arg Met Phe Ser Asn Leu
1 5 10 15

Tyr Cys Phe Ser Asp Phe Gin Pro Asn lie Ser Pro Cys Pro Leu Cys 
20 25 30 

His Cys lie Leu Pro Xaa His His His Val Phe Leu Leu Leu Ala Val 
35 40 45 

Xaa 



(2) INFORMATION FOR SEQ ID NO: 271: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:

Met Lys Leu Val Thr Met Phe Asp Lys Leu Ser Arg Asn Arg Val Ile
1 5 10 15

Gln Pro Met Gly Met Ser Pro Arg Gly His Leu Thr Ser Leu Gln Asp
20 25 30

Ala Met Cys Glu Thr Met Glu Gin Gin Leu Ser Ser Asp Pro Asp Ser 
35 40 45 

Asp Pro Asp Xaa 
50 



(2) INFORMATION FOR SEQ ID NO: 272: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272: 

Met Ala Val Gly Glu Ala Val Phe Val Pro Leu Gin His Pro Pro Leu 
1 5 10 15

Leu His Gly Ser Pro Ile Pro Lys Leu Leu Pro Gly Pro Leu Leu Xaa
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 273: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273:

Met Asn Gly Cys His Arg Arg Lys Arg Leu His Leu Cys Lys Thr Ile
1 5 10 15

Tyr Leu Leu Trp Phe Val Phe Ser Phe Leu Leu Ser Asn Glu Val Val
20 25 30 

Ser Ser His Trp His lie Leu Arg Ala Val Gin He He Cys Thr Leu 
35 40 45 

Phe His Arg Xaa He Ser Ala Phe Xaa 
50 55 



(2) INFORMATION FOR SEQ ID NO: 274: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274: 

Met Gly Trp Val Ser Ser Pro His Val Lys Arg Arg Glu Cys Val Leu 
1 5 10 15

Lys Lys Pro Phe Phe Xaa 
20 



(2) INFORMATION FOR SEQ ID NO: 275: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275:

Met Phe Asn Phe Phe Lys Asn Pro Leu Leu Thr Cys Leu Phe Ile Ser
1 5 10 15

Cys Tyr Leu Tyr Leu Ser Leu Leu Val Asn Lys Val Leu Phe Ala Glu 
20 25 30 

Glu Gly Leu Cys Cys Thr Tyr Cys Thr Thr Ser Asn Thr Gly Glu Gly 
35 40 45 

Gly Val Xaa 
50 



(2) INFORMATION FOR SEQ ID NO: 276: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276: 

Met Xaa 
1 



(2) INFORMATION FOR SEQ ID NO: 277: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:

Met Leu Cys Thr Ile Leu Thr Val Val Ile Ile Ile Ala Ala Gln Thr
1 5 10 15

Thr Arg Thr Thr Gly Ile Pro Lys Asn Ala Pro Gly Pro Ala Pro Leu
20 25 30

Cys Ala Pro Arg Ser Pro Arg Leu Phe Leu Gln Xaa Tyr Arg Gly Pro
35 40 45

Asn Gly Arg Pro Ala His Pro Phe Leu Gly Pro Ser Asp Leu Asp Thr
50 55 60

Ser Xaa
65



(2) INFORMATION FOR SEQ ID NO: 278:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 257 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278:

Met Leu Gly Ala Lys Pro His Trp Leu Pro Gly Pro Leu His Ser Pro
1 5 10 15

Gly Leu Pro Leu Val Leu Val Leu Leu Ala Leu Gly Ala Gly Trp Ala
20 25 30

Gln Glu Gly Ser Glu Pro Val Leu Leu Glu Gly Glu Cys Leu Val Val
35 40 45

Cys Glu Pro Gly Arg Ala Ala Ala Gly Gly Pro Gly Gly Ala Ala Leu
50 55 60

Gly Glu Ala Pro Pro Gly Arg Val Ala Phe Xaa Ala Val Arg Ser His
65 70 75 80

His His Glu Pro Ala Gly Glu Thr Gly Asn Gly Thr Ser Gly Ala Ile
85 90 95

Tyr Phe Asp Gln Val Leu Val Asn Glu Gly Gly Gly Phe Asp Arg Ala
100 105 110

Ser Gly Ser Phe Val Ala Pro Val Arg Gly Val Tyr Ser Phe Arg Phe
115 120 125

His Val Val Lys Val Tyr Asn Arg Gln Thr Val Gln Val Ser Leu Met
130 135 140

Leu Asn Thr Trp Pro Val Ile Ser Ala Phe Ala Asn Asp Pro Asp Val
145 150 155 160

Thr Arg Glu Ala Ala Thr Ser Ser Val Leu Leu Pro Leu Asp Pro Gly
165 170 175

Asp Arg Val Ser Leu Arg Leu Arg Arg Gly Xaa Ser Thr Gly Trp Leu
180 185 190

Gill Ile Leu Lys Phe Leu Trp Leu Pro His Leu Pro Ser Leu Lys Asp
195 200 205

Pro Ser Leu Ser Ser Thr Arg Ile Gln Pro Leu Thr Thr Phe Phe Cys
210 215 220

Pro Leu Leu Pro Xaa Lys Gln Xaa Lys Gln Xaa Xaa Xaa Ser Leu Trp
225 230 235 240

Leu Leu Ser His Leu Phe Ala Trp Glu Pro Val Pro Asn Thr Gln Val
245 250 255

Xaa



(2) INFORMATION FOR SEQ ID NO: 279:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 103 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:

Met Ala Pro Arg Ala Leu Pro Gly Ser Ala Val Leu Ala Ala Ala Val
1 5 10 15

Phe Val Gly Gly Ala Val Ser Ser Pro Leu Val Ala Pro Asp Asn Gly
20 25 30

Ser Ser Arg Thr Leu His Ser Arg Thr Glu Thr Thr Pro Ser Pro Ser
35 40 45

Asn Asp Thr Gly Asn Gly His Pro Glu Tyr Ile Ala Tyr Ala Leu Val
50 55 60

Pro Val Phe Phe Ile Met Gly Leu Phe Gly Val Leu Ile Xaa Pro Xaa
65 70 75 80

Xaa Xaa Lys Lys Lys Gly Tyr Arg Cys Thr Thr Glu Ala Glu Gln Asp
85 90 95

Ile Glu Glu Glu Lys Gly Xaa
100



(2) INFORMATION FOR SEQ ID NO: 280:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280:

Met Pro Val Thr Leu Ser Ser Leu Gly Phe Trp Val Leu Leu Ser Leu
1 5 10 15

Leu Phe Pre Trp Arg Thr Asp Gln Gly Cys Gly Pro Ala Thr Cys Tyr
20 25 30



wo 98/54963 






Xaa 



(2) INFORMATION FOR SEQ ID NO: 281: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281: 

Met Val Leu Gly Leu Leu Leu Leu Leu Xaa Phe Phe Ser Phe Ser Ser 
1 5 10 15

Ser Pro Ser Pro Ser Ser Ser Leu Leu Leu Leu Ser Ser Phe Phe Phe 
20 25 30 

Gin Ser Leu Ala Leu Ser Pro Arg Leu Glu Xaa 
35 40 



(2) INFORMATION FOR SEQ ID NO: 282:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 282:

Glu Trp Leu Val Phe Thr Phe Leu Leu Val Phe Gly Ser Pro Leu Gly
1 5 10 15

Lys Gly Pro Leu Xaa 
20 



(2) INFORMATION FOR SEQ ID NO: 283: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 70 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 283:

Met Ile Arg Ala Leu Ser Leu Phe Leu Leu Ile Phe Asp Ala Ala Leu
1 5 10 15

Phe Ser Leu Ser Val Phe Val Phe lie Gly His Leu Leu Pro Met Pro 
20 25 30 

Lys Gly Thr Gly Leu His Ser Cys Ala Lys His Leu lie Lys Ser Leu 
35 40 45 

Lys Glu Asn Val Leu Pro Leu Met Asn Tyr Pro Asp Cys Lys Leu Lys
50 55 60

Ile Asn Ile Ser Pro Xaa
65 70 



(2) INFORMATION FOR SEQ ID NO: 284:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 75 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 284: 

Met Gly Lys Leu Ile Arg Leu Ser Val Met Val Met Ser Val Arg Arg
1 5 10 15

Leu Phe Ser He Tyr Trp Val Leu Ser Thr Val Pro Asp Ala Val Gly 
20 25 30 

Ser Arg Gly Gly Met Glu Glu Glu Cys Ser Arg Gly Leu Cys Cys Val 
35 40 45 

Ala Gly Gin His Lys Gin Ala Lys Gly Lys Arg Gin Ala Trp Asn Lys 
50 55 60 

Gly Gly Glu Tyr Gin Cys Val Thr Tyr Cys Xaa 
65 70 75 



(2) INFORMATION FOR SEQ ID NO: 285: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285:

Met Pro Ala Leu Val Thr Leu Leu Leu Leu Phe Pro Leu Leu Pro Leu
1 5 10 15

Met Glu Ala Ser Cys His Val Met Arg Cys Pro Met Glu Arg Pro Thr 
20 25 30 

Xaa 



(2) INFORMATION FOR SEQ ID NO: 286:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 286:

Glu Ala Pro Trp Gly Leu Leu Lys Leu Leu Leu Leu Leu Ala Val Phe
1 5 10 15

Xaa



(2) INFORMATION FOR SEQ ID NO: 287:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 287: 

Met Gin Gin Lys Gin Lys Lys Ala Asn Glu Lys Lys Glu Glu Pro Lys 
1 5 10 15

Xaa 



(2) INFORMATION FOR SEQ ID NO: 288: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 288: 

Met Gin Arg Lys Val Ser Asp Phe lie lie His Gin Arg Leu Thr Val 
1 5 10 15 

Asn Leu Cys Val lie Ser Phe Phe Phe Phe Leu Pro He Cys He Phe 
20 25 30 

Ser Leu Ala Lys Lys Xaa 



(2) INFORMATION FOR SEQ ID NO: 289: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 289:

Met Ala Leu Leu Ile Ser Ser Leu Ile Trp Ser Xaa
1 5 10



(2) INFORMATION FOR SEQ ID NO: 290: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 290:

Met Gln Met Phe Thr Val Ser Leu Leu Leu Ser Leu Leu Leu Arg Ser
1 5 10 15

Thr Asp Gln Asn His Leu Gln Leu Leu Val Gly Arg Glu Asp His Tyr
20 25 30

Gly Gly Xaa
35



(2) INFORMATION FOR SEQ ID NO: 291: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291:

Met Ser Glu Ser Ala Cys Ile Leu Asn Asn Gln Lys Glu Leu Xaa
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 292: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 292: 

Met Asp Leu Asp Arg Val Lys Ala Glu Ala Thr Glu Asp Ile Thr Ser
1 5 10 15

Gly Val Leu Cys Leu Leu Phe Leu Arg Leu Pro Pro Asn Ser Cys Ile
20 25 30

Phe Pro Ser Ala Val Leu Gly Ser Thr Arg Thr Xaa
35 40



(2) INFORMATION FOR SEQ ID NO: 293: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 293:

Val Val Gly Thr Gly Thr Ser Leu Ala Leu Ser Ser Leu Leu Ser Leu
1 5 10 15

Leu Leu Phe Ala Gly Met Gln Met Tyr Ser Arg Gln Leu Ala Ser Thr
20 25 30

Glu Trp Leu Thr Ile Gln Gly Gly Leu Leu Gly Ser Gly Leu Phe Val
35 40 45

Phe Ser Leu Thr Ala Phe Asn Asn Leu Glu Asn Leu Val Phe Gly Lys 
50 55 60 

Gly Phe Gln Ala Lys Ile Phe Pro Glu Ile Leu Leu Cys Leu Leu Leu
65 70 75 80 

Ala Leu Phe Ala Ser Gly Leu lie His Arg Val Cys Val Thr Thr Cys 
85 90 95 

Phe He Phe Ser Met Val Gly Leu Tyr Tyr He Asn Lys He Ser Ser 
100 105 110 

Thr Leu Tyr Gin Ala Ala Ala Pro Val Leu Thr Pro Ala Lys Val Thr 
115 120 125 

Gly Lys Ser Lys Lys Arg Asn Xaa 
130 135 



(2) INFORMATION FOR SEQ ID NO: 294: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 294: 

Met Phe Ile Phe Leu Phe Leu Cys Val Leu Ser Arg Lys Ile Gln Glu
1 5 10 15

Glu Tyr Tyr Arg Leu Phe Lys Asn Val Pro Cys Cys Phe Gly Cys Leu 
20 25 30 

Arg Xaa 



(2) INFORMATION FOR SEQ ID NO: 295: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295:

Met Arg Thr Pro Gly Pro Leu Pro Val Leu Leu Leu Leu Leu Ala Gly
1 5 10 15

Ala Pro Ala Ala Arg Pro Thr Pro Pro Thr Cys Tyr Ser Arg Met Arg 
20 25 30 

Ala Leu Ser Gin Glu He Thr Arg Asp Phe Asn Leu Leu Gin Val Ser 
35 40 45 



Glu Pro Ser Glu Pro Cys Val Arg Tyr Leu Pro Arg Leu Tyr Leu Asp 
50 55 60 
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He His Asn Tyr Cys Val Leu Asp Lys Leu Arg Asp Phe Val Ala Ser 
65 70 75 80 

Pro Pro Cys Trp Lys Val Ala Gin Val Asp Ser Leu Lys Asp Lys Ala 
85 90 95 

Arg Lys Leu Tyr Thr lie Met Asn Ser Phe Cys Arg Arg Asp Leu Val 
100 105 110

Phe Leu Leu Asp Asp Cys Asn Ala Leu Glu Tyr Pro He Pro Val Thr 
115 120 125 

Thr Val Leu Pro Asp Arg Gln Arg Xaa
130 135



(2) INFORMATION FOR SEQ ID NO: 296:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 58 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 296:

Met Trp Leu Leu Lys Pro Ser Ala His Ser Pro Val His Xaa Leu Val
1 5 10 15

Leu Leu Phe Pro Arg Gly Trp Ser Gin Pro Gly Thr His Lys Arg Gin 
20 25 30 

He Leu Val Asn Xaa Ala Ser Leu Pro Gly Gly Cys Leu Leu Pro Trp 
35 40 45 

He Trp Ser Gly Ala Ala Leu Arg Phe Xaa 
50 55 



(2) INFORMATION FOR SEQ ID NO: 297: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297: 

Met Ser Arg Arg Ala Glu Ala Ser He Phe Val Leu Pro Lys Thr Leu 
15 10 15 

Leu Phe Val Leu Phe Pro Ala Phe Pro Ser Pro Ala Val Gly Cys Pro 
20 25 30 

Val Pro Xaa 
35 



(2) INFORMATION FOR SEQ ID NO: 298:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 78 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298:

Ser Cys Tyr Ile Thr Pro Trp Ser Lys Ile Gln Ser Phe Ser Leu Ser
1 5 10 15

Leu Phe Gln Phe Ile Leu Gln Glu Val Asn Ile Thr Leu Pro Glu Asn
20 25 30

Ser Val Trp Tyr Glu Arg Tyr Lys Phe Asp Ile Pro Val Phe His Leu
35 40 45

Asn Gly Gln Phe Leu Met Met His Arg Val Asn Thr Ser Lys Leu Glu
50 55 60

Lys Gln Leu Leu Lys Leu Glu Gln Gln Ser Thr Gly Xaa Xaa
65 70 75



(2) INFORMATION FOR SEQ ID NO: 299: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299:

Met Phe Val Leu Phe Ser Leu Pro Lys Tyr Ala Gly Leu Arg Leu Pro
1 5 10 15

Ile Pro Gly Leu Ser Ala Leu Leu Val Phe Leu Leu Ser Leu Phe Ser
20 25 30

Arg Arg Ala Gln Val Glu Leu Thr Thr Gly Arg Glu Thr Leu Pro Lys
35 40 45

Asn Leu Gln Gly Tyr Phe Pro Glu Phe Gly Phe Gln Val Gln Asn Phe
50 55 60

Leu Ser Cys Lys Ile Tyr Ala Ala Ser Gln Lys Gln Pro Leu Pro Pro
65 70 75 80

Leu Tyr Gln Leu Arg Phe Tyr Leu Lys His Met Gly Leu Pro Xaa
85 90 95



(2) INFORMATION FOR SEQ ID NO: 300:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 300:

Met Ser Ser His Trp Thr Leu Lys Ile Leu Leu Val Pro Leu Phe Tyr
1 5 10 15

Leu Ser Leu Glu Phe Pro Ser Gly Phe Val Leu Cys Leu Ala Asn Asp
20 25 30

Leu Gly Tyr His Phe Ser Ser Arg Val Arg Ser Xaa
35 40


(2) INFORMATION FOR SEQ ID NO: 301: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301:

Met Leu Val Val Asn Ile Asn Leu Val Phe Leu Leu Phe Phe Ile Phe
1 5 10 15

Leu Cys Tyr Leu Asp Ala Cys Ile Asn Val Phe Cys Phe Tyr Xaa
20 25 30




(2) INFORMATION FOR SEQ ID NO: 302: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 302:

Met Pro Val Leu Pro Gly Arg Thr Thr Ala Leu Leu Ser Leu Thr Leu
1 5 10 15

Ala Phe Ala Val Pro Cys Ser Gly Val Glu Ala Gly Pro Cys Val Pro
20 25 30

Arg Ser His Gly Cys Ser Ser Trp Glu Ala Ser Val Cys Val Thr Ser
35 40 45

Ser Thr Pro Gly Gly Ser Trp Arg Ala Arg Ala Leu Phe Pro Ser Ala
50 55 60

Ala Trp His Arg Xaa Ala Ala Trp Asp Ser Pro Trp Thr Gln Thr Gly
65 70 75 80

Asp Phe Ala Arg Gly Ala Met Gly Gly Ala Gly Ala Leu Pro Gly Gly
85 90 95

Cys Val Cys Ile Ser Gly Arg Pro Arg Ala Gln Lys Leu Pro Ala Leu
100 105 110

Xaa



(2) INFORMATION FOR SEQ ID NO: 303:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 303:

Thr His Ile His Thr His Ile Ile Ile Cys Ser Ser Val Xaa
1 5 10



(2) INFORMATION FOR SEQ ID NO: 304: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 304:

Met Glu Asn Phe Phe Phe Ser Phe Tyr Leu Phe Leu Ile Thr Leu Ile
1 5 10 15

Pro Asn Gly Arg Thr Leu Ser Thr Thr Ala Asp His Cys Lys Ile Pro
20 25 30

Cys Ile Xaa
35 



(2) INFORMATION FOR SEQ ID NO: 305:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 305:

Met Glu Leu Trp Glu Leu Ala Leu Cys Leu Leu Val Ala Leu Ser Ala
1 5 10 15

His Met Phe Thr Val Gln Leu Leu Ala Asp Leu Gly Phe Leu Phe Gly
20 25 30 

Gly Phe Xaa 
35 



(2) INFORMATION FOR SEQ ID NO: 306:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 82 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 306:

Met Gly Ala Gly Ile Leu Ala Leu Leu Leu Pro Leu Glu Ser Val Leu
1 5 10 15

Thr Cys Ser Vrz Ile Ser Val Ser Thr Ser Glu Arg Gln Leu Trp Gln
20 25 30

Ser Ser Gln Lys Ala Thr Ile Leu Ser Leu Lys Leu Asp Ser Cys Phe
35 40 45

Cys Gly His Ser Gly Leu Lys Gly Lys Asn Glu Asp Thr Asp Ser Ser
50 55 60

Val Pro Ile Ile Pro Ser Lys Thr His Thr His Leu Gly Lys His Leu
65 70 75 80

Ile Xaa



(2) INFORMATION FOR SEQ ID NO: 307:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 72 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 307:

Met Phe Tyr Phe Val Leu Phe Ile Tyr Ser Ser Ser Glu Thr Trp Ser
1 5 10 15

Gly Ser Val Ala Gln As? Gly Val His Gly Val Ile Ile Gly His Cys
20 25 30

Ser Val Glu Leu Pro Gly Ser Gly Asp Pro Pro Ala Ser Ala Xaa Leu
35 40 45

Val Ala Gly Thr Ile Gly Thr Cys Pro Thr Met Pro Gly Phe Val Tyr
50 55 60

Phe Leu Asn Asp Val Xaa Asn Xaa
65 70 



(2) INFORMATION FOR SEQ ID NO: 308:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308:

Met Asp Ser Thr Leu Arg Gln Gly Arg Xaa Leu Leu Thr Leu Val Pro
1 5 10 15

Ala Ser Leu Phe Ser Leu Thr Leu Gly Gly Pro Gly Pro Trp Lys Asp
20 25 30

Pro Xaa



(2) INFORMATION FOR SEQ ID NO: 309:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 115 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309:

Met Gln Val Val Gly Ser Trp Pro Gly Arg Val Gly Val Val Gly Leu
1 5 10 15

Ala Phe Ser Leu Val Ile Pro Pro Pro Ala Ile Cys Ile Ala Gly Pro
20 25 30

Ala Pro Gly Leu Gly Gly Gly Glu Arg Gln Gln Lys Gly Leu Gly Arg
35 40 45

Gly Gly Gly Gly Leu Arg Asn Cys Pro Gly Arg Val Gly Met Ala Ala
50 55 60

Glu Pro Gly Ala Leu Leu Cys Leu Thr Ser Arg Asp Gly Ser Leu Leu
65 70 75 80

Leu Ser Cys Val Arg Pro His His Val Ile Lys Pro Lys Gly Thr Ala
85 90 95

Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Xaa Xaa
100 105 110

Gly Gly Xaa 
115 



(2) INFORMATION FOR SEQ ID NO: 310: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 310:

Met Asp Leu Pro Gln Phe Ile Tyr Leu Phe Ile Phe Cys Phe Cys Cys
1 5 10 15

Leu Ala Ile Val Asn Asn Ala Ser Ile Asn Ile His Ile Gln Val Ser
20 25 30

Met Trp Leu Tyr Val Phe Ile Ser Leu Gly Tyr Leu His Gly Ser Arg
35 40 45

Ile Leu Gly His Asn Ile Ile Leu Cys Leu Thr Ser Gln Arg Ile Ala
50 55 60

Lys Arg Phe Phe Ile Val Ala Ala Ser Phe Thr Phe Pro Pro Ala Met
65 70 75 80

Tyr Lys Asp Phe Tyr Phe Ser Ile Ser Leu His Leu Pro Thr Leu Leu
85 90 95

Phe Xaa Xaa Xaa Phe Val Phe Ser Leu Leu Pro Pro
100 105 



(2) INFORMATION FOR SEQ ID NO: 311:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 311:

Met Cys Ser Pro Ser Leu Ser Ser Ser Pro Pro Pro Leu Leu Gln Val
1 5 10 15

Phe Phe Phe Phe Phe Phe Ser Pro His Trp Ala Ala Lys Val Val Pro
20 25 30

Gln Trp Lys Xaa Arg His Pro Gln Val Ser Ser Gln Leu Leu Leu Cys
35 40 45

Phe Leu Arg Val Asn Cys Gln Phe Leu Phe Leu Gln Glu Ile Leu Phe
50 55 60

Xaa
65 



(2) INFORMATION FOR SEQ ID NO: 312:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 312:

Met Cys Leu Ser Arg Trp Lys Ile Phe Tyr Thr Leu Leu Ile Leu Phe
1 5 10 15

Xaa Xaa Phe Ser Ile Thr Ser Glu Xaa Glu Thr Phe Tyr Met Ile Ile
20 25 30

Ile His His Asn Pro Thr Gln Ile Thr Ala Ser Cys Ser Phe Thr Phe
35 40 45

Leu Xaa
50


(2) INFORMATION FOR SEQ ID NO: 313:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 293 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 313:

Met Glu Arg Pro Asp Trp Glu Thr Ala Ile Gln Lys Pro Leu Cys Ser
1 5 10 15

Leu Pro Ala Gly Ser Gly Asn Ala Leu Ala Ala Ser Leu Asn His Tyr
20 25 30

Ala Gly Tyr Xaa Gln Val Thr Asn Glu Asp Leu Leu Thr Asn Cys Thr
35 40 45

Leu Leu Leu Cys Arg Arg Leu Leu Ser Pro Met Asn Leu Leu Ser Leu
50 55 60

His Thr Ala Ser Gly Leu Arg Leu Phe Ser Val Leu Ser Leu Ala Trp
65 70 75 80

Gly Phe Ile Ala Asp Val Asp Leu Glu Ser Glu Lys Tyr Arg Arg Leu
85 90 95

Gly Glu Met Arg Phe Thr Leu Gly Thr Phe Leu Arg Leu Ala Ala Leu
100 105 110

Arg Thr Tyr Arg Gly Arg Leu Ala Tyr Leu Pro Val Gly Arg Val Gly
115 120 125

Ser Lys Thr Pro Ala Ser Pro Val Val Val Gln Gln Gly Pro Val Asp
130 135 140

Ala His Leu Val Pro Leu Glu Glu Pro Val Pro Ser His Trp Thr Val
145 150 155 160

Val Pro Asp Glu Asp Phe Val Leu Val Leu Ala Leu Leu His Ser His
165 170 175

Leu Gly Ser Glu Met Phe Ala Ala Pro Met Gly Arg Cys Ala Ala Gly
180 185 190

Val Met His Leu Phe Tyr Val Arg Ala Gly Val Ser Arg Ala Met Leu
195 200 205

Leu Arg Leu Phe Leu Ala Met Glu Lys Gly Arg His Met Glu Tyr Glu
210 215 220

Cys Pro Tyr Leu Val Tyr Val Pro Val Val Ala Phe Arg Leu Glu Pro
225 230 235 240

Lys Asp Gly Lys Gly Val Phe Ala Val Asp Gly Glu Leu Met Val Ser
245 250 255

Glu Ala Val Gln Gly Gln Val His Pro Asn Tyr Phe Trp Met Val Ser
260 265 270

Gly Cys Val Glu Pro Pro Pro Ser Trp Lys Pro Gln Gln Met Pro Pro
275 280 285

Pro Glu Glu Pro Leu
290



(2) INFORMATION FOR SEQ ID NO: 314:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 314:

Met Pro Leu Glu Gly Phe Cys Leu Val Leu Asp Ile Gly Phe Leu Leu
1 5 10 15

Val Met Leu Ile Ser Leu Ala Ser Glu Cys Phe Thr Thr Cys Leu Asp
20 25 30

Ser Phe Ser Thr Thr Glu Pro Gly Cys Lys Phe Tyr Lys Leu Leu His
35 40 45

Ser Val Ser Leu Leu Asn Ile Asn Phe Asn Val Lys Ser Leu Leu Cys
50 55 60

Ser His Ile Xaa
65 



(2) INFORMATION FOR SEQ ID NO: 315: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 315:

Met Pro Leu Gln Leu Ser Gly Gln Tyr Trp Ile Ser Leu Leu Val Phe
1 5 10 15

Leu Ser Leu Gln Pro Phe Pro Gln Ala Ala Ile Pro Cys Ala Leu Thr
20 25 30

Asp Val Gly Gly Ser Cys Val Ile Cys His Ile Leu Leu Asn Cys Leu
35 40 45

Cys Ile Leu Phe Thr Leu Thr Ala Pro Ser Leu Ser His Val Leu Leu
50 55 60

Ile Lys Met Ser Leu Ser Val Cys Tyr Glu Pro Gly Ala Asp Leu Ser
65 70 75 80

Asp Arg Ala Ala Thr Gly Asn Lys Lys Leu Thr Arg Ser Thr Cys Leu
85 90 95

Leu Met His Ser Asn Lys Leu Cys Xaa
100 105



(2) INFORMATION FOR SEQ ID NO: 316:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 71 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 316: 

Met Trp Gly Cys Ser Gly Leu Gly His Arg Thr Val Ser Phe Leu Leu 
1 5 10 15

Leu Leu Pro Cys Ser Phe Pro Arg Pro Cys Xaa Leu Phe Gly Leu Ile
20 25 30

Pro Ile Ser Arg Pro Cys Lys Val Glu Ala Pro Arg Leu Ser Val Pro
35 40 45

Xaa Leu Ser Cys Ala Ser His Pro Tyr Cys Asn Cys Pro Met Ser Thr
50 55 60

Ser Cys Pro Leu Pro Arg Xaa
65 70



(2) INFORMATION FOR SEQ ID NO: 317: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 317:

Met Leu Asn Val Leu Ser Lys Val Gln Gln Leu Val Ser Xaa Leu Gly
1 5 10 15

Leu Val Thr Phe Leu Leu Asn His Ser Ala Ala Gly Gly Ser Pro Gln
20 25 30 

His Arg Trp Leu Leu Leu Xaa 
35 



(2) INFORMATION FOR SEQ ID NO: 318: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 318:

Met Lys Ala Ile Ala Arg Ala Cys Leu Leu Leu Ser Leu Leu Val Leu
1 5 10 15

Pro His Val Val Ser Glu His Leu Phe Trp His His Asn Pro Arg His
20 25 30

Pro Val Ile Trp Pro Phe Pro Pro Phe His Leu Ile Ser Cys Ser Val
35 40 45

Ser Ala Ser Thr Trp His Leu Gly Glu Xaa Leu Leu Leu Leu Val Pro
50 55 60

Ile Ala Pro Ser Val Trp Ser Xaa
65 70 



(2) INFORMATION FOR SEQ ID NO: 319: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 62 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 319:

Met Glu Gln Gly Gly Gly Pro Arg Leu Leu Leu Leu Ile Pro Gly Leu
1 5 10 15

Leu His Asn Thr Tyr Leu Ala Arg Pro Gly Asp Phe Pro Ala Gln Gly
20 25 30

Thr Thr Glu Asn Thr Glu Cys Gln Gly Ser Pro Ser Pro Ile Ser His
35 40 45

Leu Gly Lys Val Arg Ser Leu Asp Ser Asn Thr Gln Ile Xaa
50 55 60



(2) INFORMATION FOR SEQ ID NO: 320: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 320:

Met Pro Leu Leu Phe Phe Ser Val Ser Thr Leu Phe Ser Gly Ser Val
1 5 10 15

Thr Leu Gln Gln Arg Gly Met Phe Leu Pro Trp Thr Gly Thr Gly Glu
20 25 30

Gln Val Leu Ala Leu Leu Trp Pro Arg Phe Glu Leu Ile Leu Glu Met
35 40 45

Asn Val Gln Ser Val Arg Ser Thr Asp Pro Gln Arg Leu Gly Gly Leu
50 55 60

Asp Thr Arg Pro His Tyr Ile Thr Arg Arg Tyr Ala Glu Phe Ser Ser
65 70 75 80

Ala Leu Val Ser Ile Asn Gln Thr Ile Pro Asn Glu Arg Thr Met Gln
85 90 95

Leu Leu Gly Gln Leu Gln Val Glu Val Glu Asn Phe Val Leu Arg Val
100 105 110

Ala Ala Glu Phe Ser Ser Arg Lys Glu Gln Leu Val Phe Leu Ile Asn
115 120 125

Asn Tyr Asp Met Met Leu Gly Val Leu Met Glu Arg Ala Ala Asp Asp
130 135 140

Ser Lys Glu Val Glu Ser Phe Gln Gln Leu Leu Asn Ala Arg Thr Gln
145 150 155 160

Glu Phe Ile Glu Glu Leu Leu Ser Pro Pro Phe Gly Gly Leu Val Ala
165 170 175

Phe Val Lys Glu Ala Glu Ala Leu Ile Glu Arg Gly Gln Ala Glu Arg
180 185 190

Leu Arg Gly Glu Glu Ala Arg Val Thr Gln Leu Ile Arg Gly Phe Gly
195 200 205

Ser Ser Trp Lys Ser Ser Val Glu Ser Leu Ser Gln Asp Val Met Arg
210 215 220

Ser Phe Thr Asn Phe Arg Asn Gly Thr Ser Ile Ile Gln Gly Ala Leu
225 230 235 240

Thr Gln Leu Ile Gln Leu Tyr His Arg Phe His Arg Val Leu Ser Gln
245 250 255

Pro Gln Leu Arg Ala Leu Pro Ala Arg Ala Glu Leu Ile Asn Ile His
260 265 270

His Leu Met Val Glu Leu Lys Lys His Lys Pro Asn Phe Xaa
275 280 285 



(2) INFORMATION FOR SEQ ID NO: 321:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 55 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 321:

Met Phe Arg Ala Leu Arg Asp Leu Leu Thr His Tyr Pro Gln Gln Ile
1 5 10 15

Leu Leu Gln Val Leu Val Val Met Tyr Gln Val Leu Gln Val Trp Glu
20 25 30

Leu Pro Trp Pro Glu Leu Ile His Leu Gln Gly Ile Val Pro Thr Asp
35 40 45

Gln Leu His Leu Lys Gln Xaa
50 55


(2) INFORMATION FOR SEQ ID NO: 322:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 59 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 322:

Asp Phe Val Pro Val Leu Val Phe Val Leu Ile Lys Ala Asn Pro Pro
1 5 10 15

Cys Leu Leu Ser Thr Val Gln Tyr Ile Ser Ser Phe Tyr Ala Ser Cys
20 25 30

Leu Ser Gly Glu Glu Ser Tyr Trp Trp Met Gln Phe Thr Ala Ala Val
35 40 45

Glu Phe Ile Lys Thr Ile Asp Asp Arg Lys Xaa
50 55 



(2) INFORMATION FOR SEQ ID NO: 323:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 120 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 323:

Met His Pro Ala Arg Lys Leu Leu Ser Leu Leu Phe Leu Ile Leu Met
1 5 10 15

Gly Thr Glu Leu Thr Gln Asp Ser Ala Ala Pro Asp Ser Leu Leu Arg
20 25 30

Ser Ser Lys Gly Ser Thr Arg Gly Ser Leu Ala Ala Ile Val Ile Trp
35 40 45

Arg Gly Lys Ser Glu Ser Arg Ile Ala Lys Thr Pro Gly Ile Phe Arg
50 55 60

Gly Gly Gly Thr Leu Val Leu Pro Pro Thr His Thr Pro Glu Trp Leu
65 70 75 80

Ile Leu Pro Leu Gly Ile Thr Leu Pro Leu Gly Ala Pro Glu Thr Gly
85 90 95

Gly Gly Asp Cys Ala Ala Glu Thr Trp Lys Gly Ser Gln Arg Ala Gly
100 105 110

Gln Leu Cys Ala Leu Leu Ala Xaa
115 120 



(2) INFORMATION FOR SEQ ID NO: 324:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 324:

Phe Phe Leu Val Val Phe Ser Leu Ser Phe Xaa Pro Ser Val Leu Thr
1 5 10 15

Ser Pro Val His Xaa Pro His Cys Cys Gln Xaa Asp Xaa Ile Leu Phe
20 25 30 

Lys Asn Thr Leu Xaa Xaa Phe Xaa Ala Lys Tyr Xaa 
35 40 



(2) INFORMATION FOR SEQ ID NO: 325:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 59 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 325:

Met Phe Ser Arg Thr Ser Asn Phe Trp Thr Phe Phe Phe Gln Phe Leu
1 5 10 15

Ile Phe Lys Val Phe Leu Val Leu Lys Asn Xaa Phe Thr Ser Gln Lys
20 25 30

Ile Xaa Xaa Ile Xaa Xaa Glu Lys Pro Lys Lys Lys Lys Xaa Arg Gly
35 40 45

Gly Arg Ala Pro Ser Pro Gln Gly Gly Pro Xaa
50 55 



(2) INFORMATION FOR SEQ ID NO: 326: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 326:

Met Gly Leu Leu Ile Phe Met Leu Leu Ile Gly Ile His Ser Gln Cys
1 5 10 15

Ser Xaa 



(2) INFORMATION FOR SEQ ID NO: 327: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 87 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 327:

Met Val Leu Phe Cys Phe Val Leu Phe Cys Phe Val Phe Glu Met Asp
1 5 10 15



Ser Ser Ser Val Thr Gln Ala Gly Val Gln Trp Cys Asp Leu Gly Ser
20 25 30

Leu Gln Ala Pro Pro Pro Gly Phe Ser Pro Phe Ser Cys Leu Ser Leu
35 40 45

Pro Ser Ser Trp Asp Tyr Arg Arg Pro Pro Pro Arg Pro Ala Asn Phe
50 55 60

Leu Tyr Phe Leu Val Glu Thr Gly Phe His His Val Ser Gln Asp Gly
65 70 75 80 

Leu Asp Leu Leu Thr Ser Xaa 
85 



(2) INFORMATION FOR SEQ ID NO: 328: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 538 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 328:

Met Ser Thr Lys Lys Leu Cys Ile Val Gly Gly Ile Leu Leu Val Phe
1 5 10 15

Gln Ile Ile Ala Phe Leu Val Gly Gly Leu Ile Ala Pro Gly Pro Thr
20 25 30

Thr Ala Val Ser Tyr Met Ser Val Lys Cys Val Asp Ala Arg Lys Asn
35 40 45

His His Lys Thr Lys Trp Phe Val Pro Trp Gly Pro Asn His Cys Asp
50 55 60

Lys Ile Arg Asp Ile Glu Glu Ala Ile Pro Arg Glu Ile Glu Ala Asn
65 70 75 80

Asp Ile Val Phe Ser Val His Ile Pro Leu Pro His Met Glu Met Ser
85 90 95

Pro Trp Phe Gln Phe Met Leu Phe Ile Leu Gln Leu Asp Ile Ala Phe
100 105 110

Lys Leu Asn Asn Gln Ile Arg Glu Asn Ala Glu Val Ser Met Asp Val
115 120 125

Ser Leu Ala Tyr Arg Asp Asp Ala Phe Ala Glu Trp Thr Glu Met Ala
130 135 140

His Glu Arg Val Pro Arg Lys Leu Lys Cys Thr Phe Thr Ser Pro Lys
145 150 155 160

Thr Pro Glu His Glu Gly Arg Tyr Tyr Glu Cys Asp Val Leu Pro Phe
165 170 175

Met Glu Ile Gly Ser Val Ala His Lys Phe Tyr Leu Leu Asn Ile Arg
180 185 190 



Leu Pro Val Asn Glu Lys Lys Lys Ile Asn Val Gly Ile Gly Glu Ile
195 200 205

Lys Asp Ile Arg Leu Val Gly Ile His Gln Asn Gly Gly Phe Thr Lys
210 215 220

Val Trp Phe Ala Met Lys Thr Phe Leu Thr Pro Ser Ile Phe Ile Ile
225 230 235 240

Met Val Trp Tyr Trp Arg Arg Ile Thr Met Met Ser Arg Pro Pro Val
245 250 255

Leu Leu Glu Lys Val Ile Phe Ala Leu Gly Ile Ser Met Thr Phe Ile
260 265 270

Asn Ile Pro Val Glu Trp Phe Ser Ile Gly Phe Asp Trp Thr Trp Met
275 280 285

Leu Leu Phe Gly Asp Ile Arg Gln Gly Ile Phe Tyr Ala Met Leu Leu
290 295 300

Ser Phe Trp Ile Ile Phe Cys Gly Glu His Met Met Asp Gln His Glu
305 310 315 320

Arg Asn His Ile Ala Gly Tyr Trp Lys Gln Val Gly Pro Ile Ala Val
325 330 335

Gly Ser Phe Cys Leu Phe Ile Phe Asp Met Cys Glu Arg Gly Val Gln
340 345 350

Leu Thr Asn Pro Phe Tyr Ser Ile Trp Thr Thr Asp Ile Gly Thr Glu
355 360 365

Leu Ala Met Ala Phe Ile Ile Val Ala Gly Ile Cys Leu Cys Leu Tyr
370 375 380

Phe Leu Phe Leu Cys Phe Met Val Phe Gln Val Phe Arg Asn Ile Ser
385 390 395 400

Gly Lys Gln Ser Ser Leu Pro Ala Met Ser Lys Val Arg Arg Leu His
405 410 415

Tyr Glu Gly Leu Ile Phe Arg Phe Lys Phe Leu Met Leu Ile Thr Leu
420 425 430

Ala Cys Ala Ala Met Thr Val Ile Phe Phe Ile Val Ser Gln Val Thr
435 440 445

Glu Gly His Trp Lys Trp Gly Gly Val Thr Val Gln Val Asn Ser Ala
450 455 460

Phe Phe Thr Gly Ile Tyr Gly Met Trp Asn Leu Tyr Val Phe Ala Leu
465 470 475 480

Met Phe Leu Tyr Ala Pro Ser His Lys Asn Tyr Gly Glu Asp Gln Ser
485 490 495

Asn Gly Met Gln Leu Pro Cys Lys Ser Arg Glu Asp Cys Ala Leu Phe
500 505 510

Val Ser Glu Leu Tyr Gln Glu Leu Phe Ser Ala Ser Lys Tyr Ser Phe
515 520 525

Ile Asn Asp Asn Ala Ala Ser Gly Ile Xaa
530 535 



(2) INFORMATION FOR SEQ ID NO: 329:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 202 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 329:

Met Gly Ile Ala Leu Ala Val Leu Gly Trp Leu Ala Val Met Leu Cys
1 5 10 15

Cys Ala Leu Pro Met Trp Arg Val Thr Ala Phe Ile Gly Ser Asn Ile
20 25 30

Val Thr Ser Gln Thr Ile Trp Glu Gly Leu Trp Met Asn Cys Val Val
35 40 45

Gln Ser Thr Gly Gln Met Gln Cys Lys Val Tyr Asp Ser Leu Leu Ala
50 55 60

Leu Pro Gln Asp Leu Gln Ala Ala Arg Ala Leu Val Ile Ile Ser Ile
65 70 75 80

Ile Val Ala Ala Leu Gly Val Leu Leu Ser Val Val Gly Gly Lys Cys
85 90 95

Thr Asn Cys Leu Glu Asp Glu Ser Ala Lys Ala Lys Thr Met Ile Val
100 105 110

Ala Gly Val Val Phe Leu Leu Ala Gly Leu Met Val Ile Val Pro Val
115 120 125

Ser Trp Thr Ala His Asn Ile Ile Gln Asp Phe Tyr Asn Pro Leu Val
130 135 140

Ala Ser Gly Gln Lys Arg Glu Met Gly Ala Ser Leu Tyr Val Gly Trp
145 150 155 160 

Ala Ala Ser Gly Leu Leu Leu Leu Gly Gly Gly Leu Leu Cys Cys Asn 
165 170 175 

Cys Pro Pro Arg Thr Asp Lys Pro Tyr Ser Ala Lys Tyr Ser Ala Ala 
180 185 190 

Arg Ser Ala Ala Ala Ser Asn Tyr Val Xaa 
195 200 



(2) INFORMATION FOR SEQ ID NO: 330: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 330:

Met Ala Thr Val Thr Ala Thr Thr Lys Val Pro Glu Ile Arg Asp Val
1 5 10 15

Thr Arg Ile Glu Arg Ile Gly Ala His Ser His Ile Arg Gly Leu Gly
20 25 30

Leu Asp Asp Ala Leu Glu Pro Arg Gln Ala Ser Gln Gly Met Val Gly
35 40 45

Gln Leu Ala Ala Arg Arg Ala Ala Gly Val Val Leu Glu Met Ile Arg
50 55 60

Glu Gly Lys Ile Ala Gly Arg Ala Val Leu Ile Ala Gly Gln Pro Gly
65 70 75 80

Thr Gly Lys Thr Ala Ile Ala Met Gly Met Ala Gln Ala Leu Gly Pro
85 90 95

Asp Thr Pro Phe Thr Ala Ile Ala Gly Ser Glu Ile Phe Ser Leu Glu
100 105 110

Met Ser Lys Thr Glu Ala Leu Thr Gln Ala Phe Arg Arg Ser Ile Gly
115 120 125

Val Arg Ile Lys Glu Glu Thr Glu Ile Ile Glu Gly Glu Val Val Glu
130 135 140

Ile Gln Ile Asp Arg Pro Ala Thr Gly Thr Gly Ser Lys Val Gly Lys
145 150 155 160

Leu Thr Leu Lys Thr Thr Glu Met Glu Thr Ile Tyr Asp Leu Gly Thr
165 170 175

Lys Met Ile Xaa Ser Leu Thr Lys Asp Lys Val Gln Ala Gly Asp Val
180 185 190

Ile Thr Ile Asp Lys Ala Thr Gly Lys Ile Ser Lys Leu Gly Arg Ser
195 200 205

Phe Thr Arg Ala Arg Glu Leu Arg Arg Tyr Gly Leu Pro Asp Gln Val
210 215 220

Arg Ala Val Pro Arg Trp Gly Ala Pro Glu Thr Gln Gly Gly Gly Ala
225 230 235 240

His Arg Val Pro Ala Arg Asp Arg Arg His Gln Leu Ser His Pro Gly
245 250 255

Leu Pro Gly Ala Leu Leu Arg
260



(2) INFORMATION FOR SEQ ID NO: 331:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 260 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 331:

Met Leu Ala Leu Leu Gly Leu Ser Gln Ala Leu Asn Ile Leu Leu Gly
1 5 10 15

Leu Lys Gly Leu Ala Pro Ala Glu Ile Ser Ala Val Cys Glu Lys Gly
20 25 30

Asn Phe Asn Val Ala His Gly Leu Ala Trp Ser Tyr Tyr Ile Gly Tyr
35 40 45

Leu Arg Leu Ile Leu Pro Glu Leu Gln Ala Arg Ile Arg Thr Tyr Asn
50 55 60

Gln His Tyr Asn Asn Leu Leu Arg Gly Ala Val Ser Gln Arg Leu Tyr
65 70 75 80

Ile Leu Leu Pro Leu Asp Cys Gly Val Pro Asp Asn Leu Ser Met Ala
85 90 95

Asp Pro Asn Ile Arg Phe Leu Asp Lys Leu Pro Gln Gln Thr Gly Asp
100 105 110

Arg Ala Gly Ile Lys Asp Arg Val Tyr Ser Asn Ser Ile Tyr Glu Leu
115 120 125

Leu Glu Asn Gly Gln Arg Ala Gly Thr Cys Val Leu Glu Tyr Ala Thr
130 135 140

Pro Leu Gln Thr Leu Phe Ala Met Ser Gln Tyr Ser Gln Ala Gly Phe
145 150 155 160

Ser Gly Glu Asp Arg Leu Glu Gln Ala Lys Leu Phe Cys Arg Thr Leu
165 170 175

Glu Asp Ile Leu Ala Asp Ala Pro Glu Ser Gln Asn Asn Cys Arg Leu
180 185 190

Ile Ala Tyr Gln Glu Pro Ala Asp Asp Ser Ser Phe Ser Leu Ser Gln
195 200 205

Glu Val Leu Arg His Leu Arg Gln Glu Glu Lys Glu Glu Val Thr Val
210 215 220

Gly Ser Leu Lys Thr Ser Ala Val Pro Ser Thr Ser Thr Met Ser Gln
225 230 235 240

Glu Pro Glu Leu Leu Ile Ser Gly Met Glu Lys Pro Leu Pro Leu Arg
245 250 255

Thr Asp Phe Ser
260



(2) INFORMATION FOR SEQ ID NO: 332:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 332:

Met Thr Pro Gln Lys Pro Ala Leu Ala Val Leu Leu Leu Glu Val Pro
1 5 10 15

Leu Leu Leu Thr Leu Ser Val Leu Lys Lys Arg Cys Leu Val Thr Cys 
20 25 30 

Glu Pro Thr Ser Arg Phe Val Ser Cys Asp Leu Pro Leu Ser Val Xaa 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 333: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 334 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 333:

Met Ala Ala Ala Ala Trp Leu Gln Val Leu Pro Val Ile Leu Leu Leu
1 5 10 15

Leu Gly Ala His Pro Ser Pro Leu Ser Phe Phe Ser Ala Gly Pro Ala
20 25 30

Thr Val Ala Ala Ala Asp Arg Ser Lys Trp His Ile Pro Ile Pro Ser
35 40 45

Gly Lys Asn Tyr Phe Ser Phe Gly Lys Ile Leu Phe Arg Asn Thr Thr
50 55 60

Ile Phe Leu Lys Phe Asp Gly Glu Pro Cys Asp Leu Ser Leu Asn Ile
65 70 75 80

Thr Trp Tyr Leu Lys Ser Ala Asp Cys Tyr Asn Glu Ile Tyr Asn Phe
85 90 95

Lys Ala Glu Glu Val Glu Leu Tyr Leu Glu Lys Leu Lys Glu Lys Arg
100 105 110

Gly Leu Ser Gly Lys Tyr Gln Thr Ser Ser Lys Leu Phe Gln Asn Cys
115 120 125

Ser Glu Leu Phe Lys Thr Gln Thr Phe Ser Gly Asp Phe Met His Arg
130 135 140 



Leu Pro Leu Leu Gly Glu Lys Gln Glu Ala Lys Glu Asn Gly Thr Asn
145 150 155 160

Leu Thr Phe Ile Gly Asp Lys Thr Ala Met His Glu Pro Leu Gln Thr
165 170 175

Trp Gln Asp Ala Pro Tyr Ile Phe Ile Val His Ile Gly Ile Ser Ser
180 185 190

Ser Lys Glu Ser Ser Lys Glu Asn Ser Leu Ser Asn Leu Phe Thr Met
195 200 205

Thr Val Glu Val Lys Gly Pro Tyr Glu Tyr Leu Thr Leu Glu Asp Tyr
210 215 220

Pro Leu Met Ile Phe Phe Met Val Met Cys Ile Val Tyr Val Leu Phe
225 230 235 240

Gly Val Leu Trp Leu Ala Trp Ser Ala Cys Tyr Trp Arg Asp Leu Leu
245 250 255

Arg Ile Gln Phe Trp Ile Gly Ala Val Ile Phe Leu Gly Met Leu Glu
260 265 270

Lys Ala Val Phe Tyr Ala Glu Phe Gln Asn Ile Arg Tyr Lys Gly Xaa
275 280 285

Ser Val Gln Gly Ala Leu Ile Leu Ala Glu Leu Leu Ser Ala Val Lys
290 295 300

Arg Ser Leu Ala Arg Thr Leu Val Ile Ile Val Ser Leu Gly Tyr Gly
305 310 315 320

Ile Val Lys Pro Arg Leu Glu Ser Leu Phe Ile Arg Leu Xaa
325 330 



(2) INFORMATION FOR SEQ ID NO: 334: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 200 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 334:

Met Val Leu Xaa Val Val Thr Leu Gly Leu Ala Leu Phe Thr Leu Cys
1 5 10 15

Gly Lys Phe Lys Arg Trp Lys Leu Asn Gly Ala Phe Leu Leu Ile Thr
20 25 30

Ala Phe Leu Ser Val Leu Ile Trp Val Ala Trp Met Thr Met Tyr Leu
35 40 45

Phe Gly Asn Val Lys Leu Gln Gln Gly Asp Ala Trp Asn Asp Pro Thr
50 55 60

Leu Ala Ile Thr Leu Ala Ala Ser Ala Gly Ser Ser Ser Ser Ser Thr
65 70 75 80

Pro Ser Leu Arg Ser Thr Ala Pro Phe Cys Gln Pro Cys Arg Arg Thr
85 90 95

Arg Pro Thr Thr Ser Thr Arg Arg Ser Pro Gly Cys Gly Arg Arg Pro
100 105 110

Ser Arg Arg Thr Cys Ser Cys Arg Gly Pro Ile Trp Arg Thr Arg Pro
115 120 125

Ser Pro Trp Met Asn Thr Met Gln Leu Ser Glu Gln Gln Asp Phe Pro
130 135 140

Thr Ala Ala Trp Glu Lys Asp Pro Val Ala Ala Trp Gly Lys Asp Pro
145 150 155 160

Ala Leu Arg Leu Glu Ala Thr Cys Ile Ser Gln Leu Arg Trp Pro Ser
165 170 175

Cys Ser Thr Val Gly Pro Ser Gln Leu Leu Arg Gln Val Thr Gln Glu
180 185 190 

Xaa Thr Phe Gly Glu Arg Leu Xaa 
195 200 



(2) INFORMATION FOR SEQ ID NO: 335:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 335:

Met Leu Leu His His Gln Leu Leu Ile Val Thr Leu His Leu Val Leu
1 5 10 15

Leu Leu Ala Thr Leu Leu Val Xaa 
20 



(2) INFORMATION FOR SEQ ID NO: 336: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 336: 

Met Thr Lys Ala Leu Leu Ile Tyr Leu Val Ser Ser Phe Leu Ala Leu
1 5 10 15

Asn Gln Ala Ser Leu Ile Ser Arg Cys Asp Leu Ala Gln Val Leu Gln
20 25 30

Leu Glu Asp Leu Asp Gly Phe Glu Gly Tyr Ser Leu Ser Asp Trp Leu
35 40 45

Cys Leu Ala Phe Val Glu Ser Lys Phe Asn Ile Ser Lys Ile Asn Glu
50 55 60

Asn Ala Asp Gly Ser Phe Asp Tyr Gly Leu Phe Gln Ile Asn Ser His
65 70 75 80

Tyr Trp Cys Asn Xaa Tyr Lys Ser Tyr Ser Glu Asn Leu Cys His Val
85 90 95

Asp Cys Gln Asp Leu Leu Asn Pro Asn Leu Leu Ala Gly Ile His Cys
100 105 110

Ala Lys Arg Ile Val Ser Gly Ala Arg Gly Met Asn Asn Trp Val Arg
115 120 125

Met Glu Xaa Cys Thr Val Gln Ala Gly His Ser Ser Thr Gly Xaa
130 135 140 



(2) INFORMATION FOR SEQ ID NO: 337:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 95 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 337:

Met Leu Val Ile Ala Gly Gly Ile Leu Ala Ala Leu Leu Leu Leu Ile
1 5 10 15

Val Val Val Leu Cys Leu Tyr Kie Lys Ile His Asn Ala Leu Lys Ala
20 25 30

Ala Lys Glu Pro Glu Ala Val Ala Val Lys Asn His Asn Pro Asp Lys
35 40 45

Val Trp Trp Ala Lys Asn Ser Gln Ala Lys Thr Ile Ala Thr Glu Ser
50 55 60

Cys Pro Ala Leu Gln Cys Cys Glu Gly Tyr Arg Met Cys Ala Ser Phe
65 70 75 80

Asp Ser Leu Pro Pro Cys Cys Cys Asp Ile Asn Glu Gly Leu Xaa
85 90 95 



(2) INFORMATION FOR SEQ ID NO: 338: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 338:

Met Leu Leu Lys Ser Asn Ile Leu Met Leu Asn Leu Phe Ala Ala Asn
1 5 10 15

Val Gly Ala Asn Phe Ala Leu Thr Val Glu Lys Ile Gly Met Ile Leu
20 25 30

Leu Asn Val Ser Gly Xaa
35



(2) INFORMATION FOR SEQ ID NO: 339: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 339:

Met Leu Val Val Ala Phe Gly Leu Leu Val Leu Tyr Ile Leu Leu Ala
1 5 10 15

Ser Ser Trp Lys Arg Pro Glu Pro Gly Ile Leu Thr Asp Arg Gln Pro
20 25 30 

Leu Leu His Asp Gly Glu Xaa 



(2) INFORMATION FOR SEQ ID NO: 340: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 71 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 340: 

Ser Asp Pro Leu Ala Ser Ala Ser Gln Asn Ala Gly Ile Val Ser Val
1 5 10 15

Gly Leu Cys Thr Arg Pro Gly Pro Gln Phe Lys Asn Ala Gln Pro Pro
20 25 30

Phe Pro Xaa Gln Lys Ala Pro Arg Cys Leu Trp Glu Asn Gln Pro Pro
35 40 45

Pro Trp Arg Lys Ala Trp Asp Leu Pro Ser His Leu Gly Arg Arg Gly
50 55 60

Ile Cys Gly Lys Ser Phe Xaa
65 70 



(2) INFORMATION FOR SEQ ID NO: 341: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 85 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 341:

Tyr Val Met Ile Phe Lys Lys Glu Phe Ala Pro Ser Asp Glu Glu Leu
1 5 10 15

Asp Ser Tyr Arg Arg Gly Glu Glu Trp Asp Pro Gln Lys Ala Glu Glu
20 25 30

Lys Arg Asn Xaa Lys Glu Leu Ala Gln Arg Gln Xaa Gly Gly Gly Ser
35 40 45

Pro Ala Gly Ala Cys Gly Gly Glu Pro Cys Gln Arg Leu Gln Gly Gln
50 55 60

Val Gln Pro Pro His Arg Gln Gly Ser Ser Gln Arg Arg Ser Pro His
65 70 75 80

Ala Thr Gly Gln Xaa
85 



(2) INFORMATION FOR SEQ ID NO: 342: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 90 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 342: 

Met Trp Asp Trp Asp Trp Ser Ala Pro Trp Ser Trp Pro Leu Trp Leu 
1 5 10 15

Ser Leu Ala Leu Val Cys Leu Ser Ala Gly Ala Lys Gly His Arg Ala 
20 25 30 

Ser Glu Ala Gly His Ala Arg Ala Leu Thr Cys Glu Met Gly Ser Glu 
35 40 45 

Phe Xaa Thr Ala Xaa Gly Leu Val Leu Gly Xaa Xaa Xaa Trp Thr Xaa
50 55 60 

Xaa Asn Gly Ser Ala Gly Pro Glu Arg Arg Gly Trp Arg Pro Ala Ala 
65 70 75 80 

Phe Leu Ala Val Phe Leu Leu Gly Asp Xaa 
85 90 



(2) INFORMATION FOR SEQ ID NO: 343: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 343: 

Met Phe Gly Pro Thr Phe His Ser Leu Val Leu Val Pro Pro Trp Pro 
1 5 10 15

Asn Leu Ser Leu Leu His Phe Thr Ser Pro Val Gly Gln His Ser Ser
20 25 30 



Phe Leu Pro Thr Ser Leu Arg Leu Xaa Lys Lys Lys Lys Lys Lys Lys 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 344:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 344: 

Met Cys Ser Lys Asn Gly Phe Leu Leu Ala Trp Ser Trp Asn Ser Pro 
1 5 10 15

Trp Leu Pro Gln Ala Ser Leu Ala His Gly Cys Trp Gly Arg Trp Met
20 25 30 

Ser Asp Leu Val Gly Cys Ser Arg Glu Asn Lys Cys Ala Leu Arg Asp 
35 40 45 

His Ser Glu Arg Val Gln Gly Xaa
50 55 



(2) INFORMATION FOR SEQ ID NO: 345: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 222 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 345: 

Ser Pro Leu Xaa Phe Cys Val Val Leu Leu Leu Gln Ala Ala Arg Gly
1 5 10 15

Tyr Val Val Arg Lys Pro Ala Gln Ser Arg Leu Asp Asp Asp Pro Pro
20 25 30

Pro Ser Thr Leu Leu Lys Asp Tyr Gln Asn Val Pro Gly Ile Glu Lys
35 40 45

Val Asp Asp Val Val Lys Arg Leu Leu Ser Leu Glu Met Ala Asn Lys
50 55 60

Lys Glu Met Leu Lys Ile Lys Gln Glu Gln Phe Met Lys Lys Ile Val
65 70 75 80

Ala Asn Pro Glu Asp Thr Arg Ser Leu Glu Ala Arg Ile Ile Ala Leu
85 90 95

Ser Val Lys Ile Arg Ser Tyr Glu Glu His Leu Glu Lys His Arg Lys
100 105 110

Asp Lys Ala His Lys Arg Tyr Leu Leu Met Ser Ile Asp Gln Arg Lys
115 120 125 

Lys Met Leu Lys Asn Leu Arg Asn Thr Asn Tyr Asp Val Phe Glu Lys
130 135 140

Ile Cys Trp Gly Leu Gly Ile Glu Tyr Thr Phe Pro Pro Leu Tyr Tyr
145 150 155 160

Arg Arg Ala His Arg Arg Phe Val Thr Lys Lys Ala Leu Cys Ile Arg
165 170 175

Val Phe Gln Glu Thr Gln Lys Leu Lys Lys Arg Arg Arg Ala Leu Lys
180 185 190

Ala Ala Ala Ala Ala Gln Lys Gln Ala Lys Arg Arg Asn Pro Asp Ser
195 200 205

Pro Ala Lys Ala Ile Pro Lys Thr Leu Lys Asp Ser Gln Xaa
210 215 220 



(2) INFORMATION FOR SEQ ID NO: 346: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 346:

Met Gly Ala Pro Ala Ala Ser Leu Leu Leu Leu Leu Leu Leu Phe Ala
1 5 10 15

Cys Cys Trp Ala Pro Gly Gly Ala Asn Leu Ser Gln Asp Asp Ser Gln
20 25 30

Pro Trp Thr Ser Asp Glu Thr Val Val Ala Gly Gly Thr Val Val Leu
35 40 45

Lys Cys Gln Val Lys Asp His Glu Asp Ser Ser Leu Gln Trp Ser Xaa
50 55 60 



(2) INFORMATION FOR SEQ ID NO: 347: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 154 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 347:

Met Val Ala Pro Val Trp Tyr Leu Val Ala Ala Ala Leu Leu Val Gly 
1 5 10 15

Phe Ile Leu Phe Leu Thr Arg Ser Arg Gly Arg Ala Ala Ser Ala Gly
20 25 30

Gln Glu Pro Leu His Asn Glu Glu Leu Ala Gly Ala Gly Arg Val Ala
35 40 45

Gln Pro Gly Pro Leu Glu Pro Glu Glu Pro Arg Ala Gly Gly Arg Pro
50 55 60

Arg Arg Arg Arg Asp Leu Gly Ser Arg Leu Gln Ala Gln Arg Arg Ala
65 70 75 80

Gln Arg Val Ala Trp Ala Glu Ala Asp Glu Asn Glu Glu Glu Ala Val
85 90 95

Ile Leu Ala Gln Glu Glu Glu Gly Val Glu Lys Pro Ala Glu Xaa His
100 105 110

Leu Ser Gly Lys Ile Gly Ala Lys Lys Leu Arg Xaa Xaa Glu Glu Lys
115 120 125

Gln Ala Arg Lys Ala Gln Xaa Glu Ala Glu Glu Ala Glu Arg Glu Xaa
130 135 140

Arg Lys Arg Leu Glu Ser Gln Arg Glu Xaa
145 150 



(2) INFORMATION FOR SEQ ID NO: 348: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 348:

Met Gln Lys Cys Met Leu Ser Ala Leu Val Phe His Ile Gln Trp Ser
1 5 10 15

Xaa 



(2) INFORMATION FOR SEQ ID NO: 349: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 349:

Met Leu Val Cys Ser Phe Leu Phe Leu Xaa
1 5 10



(2) INFORMATION FOR SEQ ID NO: 350: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 350:

Val Ile Glu Leu Cys Val Ser Leu Arg Ser Leu Asn Phe Xaa
1 5 10



(2) INFORMATION FOR SEQ ID NO: 351: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 351: 

Met Cys Glu Phe Xaa Xaa Xaa Ile Met Xaa Leu Ala Gly Tyr Phe Ala
1 5 10 15

Cys Xaa 



(2) INFORMATION FOR SEQ ID NO: 352: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 62 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 352:

Met Val Gly Gly Tyr Val Ser Ser Phe Ser Phe Pro Pro Val Ser Ser 
1 5 10 15

Ser Leu Leu Leu Pro Ala Ser Phe Ala Phe Pro Phe Leu Pro Gly Thr 
20 25 30

Pro Cys Pro Phe Leu Tyr Phe Leu Pro Ser Pro Phe Ser Pro Leu Pro 
35 40 45 

Leu Ser Leu Thr Arg Ser Asn Ser Phe Leu Leu Asn Gly Xaa 
50 55 60 



(2) INFORMATION FOR SEQ ID NO: 353:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 353: 

Glu Lys Lys Ser Met Ser Val Ser Asp Ile Tyr Ala Leu Glu Ser Leu
1 5 10 15

Gly Arg Ser Leu Phe Thr Leu Asn Ser Met Cys Leu Pro Leu Ser Phe 
20 25 30 

Xaa 

(2) INFORMATION FOR SEQ ID NO: 354:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 245 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 354:

Met Gly Gly Ala Ser Arg Arg Val Glu Ser Gly Ala Trp Ala Tyr Leu 
1 5 10 15

Ser Pro Leu Val Leu Arg Lys Glu Leu Glu Ser Leu Val Glu Asn Glu 
20 25 30 



Gly Ser Glu Val Leu Ala Leu Pro Glu Leu Pro Ser Ala His Pro Ile
35 40 45

Ile Phe Trp Asn Leu Leu Trp Tyr Phe Gln Arg Leu Arg Leu Pro Ser
50 55 60

Ile Leu Pro Gly Leu Val Leu Ala Ser Cys Asp Gly Pro Ser Xaa Ser
65 70 75 80

Gln Ala Pro Ser Pro Trp Leu Thr Pro Asp Pro Ala Ser Val Gln Val
85 90 95

Arg Leu Leu Trp Asp Val Leu Thr Pro Asp Pro Asn Ser Cys Pro Pro
100 105 110

Leu Tyr Val Leu Trp Arg Val His Ser Gln Ile Pro Gln Arg Val Val
115 120 125

Trp Pro Gly Pro Val Pro Ala Ser Leu Ser Leu Ala Leu Leu Glu Ser
130 135 140

Val Leu Arg His Val Gly Leu Asn Glu Val His Lys Ala Val Gly Leu
145 150 155 160

Leu Leu Glu Thr Leu Gly Pro Pro Pro Thr Gly Leu His Leu Gln Arg
165 170 175

Gly Ile Tyr Arg Glu Ile Leu Phe Leu Thr Met Ala Ala Leu Gly Lys
180 185 190

Asp His Val Asp Ile Val Ala Phe Asp Lys Lys Tyr Lys Ser Ala Phe
195 200 205

Asn Lys Leu Ala Ser Ser Met Gly Lys Glu Glu Leu Arg His Arg Arg
210 215 220

Ala Gln Met Pro Thr Pro Lys Ala Ile Asp Cys Arg Lys Cys Phe Gly
225 230 235 240



Ala Pro Pro Glu Cys 
245 

(2) INFORMATION FOR SEQ ID NO: 355:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 355:

Met Lys Phe Ser Leu Leu Phe Leu Pro Met Leu Leu Ile Leu Lys Pro
1 5 10 15

Asp Leu Phe His Ile Ser Ile Cys Thr Leu Ala Ala Cys Gly Leu Thr
20 25 30 

Phe Pro Xaa 
35 



(2) INFORMATION FOR SEQ ID NO: 356: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 356:

Met Leu Phe Phe Phe Ile Leu His Leu Leu Ser Ile Met Ser Phe Leu
1 5 10 15

Ser Pro Asp Ile Met Xaa
20 



(2) INFORMATION FOR SEQ ID NO: 357: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 98 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 357:

Met Phe Gly Leu Leu Val Glu Ser Gln Thr Leu Leu Glu Glu Asn Ala
1 5 10 15

Val Gln Gly Thr Glu Arg Thr Leu Gly Leu Asn Ile Ala Pro Phe Ile
20 25 30

Asn Gln Phe Gln Val Pro Ile Arg Val Phe Leu Asp Leu Ser Ser Leu
35 40 45

Pro Cys Ile Pro Leu Ser Lys Pro Val Glu Leu Leu Arg Leu Asp Leu
50 55 60

Met Thr Pro Tyr Leu Asn Thr Ser Asn Arg Glu Val Lys Val Tyr Val
65 70 75 80

Cys Xaa Ile Trp Glu Asp Leu Thr Ala Ile Pro Phe Trp Val Ser Tyr
85 90 95



Val Pro 



(2) INFORMATION FOR SEQ ID NO: 358:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 358: 

Met Phe Gly Ala His Arg Xaa Trp Gln Gly Ser Val Leu Leu Phe Leu
1 5 10 15

Ser Phe Ala Trp Gly Asn Gly Gly Ser Val Thr Phe Ser Asp Val Pro
20 25 30

Arg Val Met Pro Leu Ala Gly Gly Pro Xaa Xaa Gln Val Ser Ser Thr
35 40 45

Pro Arg Pro Pro Pro His Gln Val Thr Ser Ser Pro Gly Leu Glu Ser
50 55 60

Ala His Ile Val Cys Pro Glu Arg Lys Lys Lys Lys Lys Lys
65 70 75 



(2) INFORMATION FOR SEQ ID NO: 359: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 359: 

Thr Leu Leu Xaa Phe Leu Xaa Leu Leu Thr Thr Glu Gly Gly Arg Glu 
1 5 10 15

Asn Ile Phe Xaa Gly Arg Ile Leu Xaa Leu Gln Xaa Ser Pro Xaa
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 360: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 360: 



Met Leu Ser Phe Phe Ile Cys Leu Leu Ile Phe Val His Leu Leu Leu
1 5 10 15



